PROTEIN PHOSPHATASES

TECHNICAL FIELD
This invention relates to nucleic acid and amino acid sequences of protein phosphatases and
to the use of these sequences in the diagnosis, treatment, and prevention of immune system disorders,
neurological disorders, developmental disorders, and cell proliferative disorders, and in the
assessment of the effects of exogenous compounds on the expression of nucleic acid and amino acid
sequences of protein phosphatases.

BACKGROUND OF THE INVENTION

Reversible protein phosphorylation is the ubiquitous strategy used to control many of the
intracellular events in eukaryotic cells. It is estimated that more than ten percent of proteins active in
a typical mammalian cell are phosphorylated. Kinases catalyze the transfer of high-energy phosphate
groups from adenosine triphosphate (ATP) to target proteins on the hydroxyamino acid residues
serine, threonine, or tyrosine. Phosphatases, in contrast, remove these phosphate groups.
Extracellular signals including hormones, neurotransmitters, and growth and differentiation factors
can activate kinases, which can occur as cell surface receptors or as the activator of the final effector
protein, but can also occur along the signal transduction pathway. Cascades of kinases occur, as well
as kinases sensitive to second messenger molecules. This system allows for the amplification of weak
signals (low abundance growth factor molecules, for example), as well as the synthesis of many weak
signals into an all-or-nothing response. Phosphatases, then, are essential in determining the extent of
phosphorylation in the cell and, together with kinases, regulate key cellular processes such as
metabolic enzyme activity, proliferation, cell growth and differentiation, cell adhesion, and cell cycle
progression.

Protein phosphatases are generally characterized as either serine/threonine- or tyrosine-specific
based on their preferred phospho-amino acid substrate. However, some phosphatases (DSPs,
for dual specificity phosphatases) can act on phosphorylated tyrosine, serine, or threonine residues.
The protein serine/threonine phosphatases (PSPs) are important regulators of many cAMP-mediated
hormone responses in cells. Protein tyrosine phosphatases (PTPs) play a significant role in cell cycle
and cell signaling processes. Another family of phosphatases is the acid phosphatase or histidine acid
phosphatase (HAP) family whose members hydrolyze phosphate esters at acidic pH conditions.

PSPs are found in the cytosol, nucleus, and mitochondria and in association with cytoskeletal
and membranous structures in most tissues, especially the brain. Some PSPs require divalent cations,
such as Ca²⁺ or Mn²⁺, for activity. PSPs play important roles in glycogen metabolism, muscle
contraction, protein synthesis, T cell function, neuronal activity, oocyte maturation, and hepatic
metabolism (reviewed in Cohen, P. (1989) Annu. Rev. Biochem. 58:453-508). PSPs can be separated
into two classes. The PPP class includes PP1, PP2A, PP2B/calcineurin, PP4, PP5, PP6, and PP7.
Members of this class are composed of a homologous catalytic subunit bearing a very highly
conserved signature sequence, coupled with one or more regulatory subunits (PROSITE
PDOC00115). Further interactions with scaffold and anchoring molecules determine the intracellular
localization of PSPs and substrate specificity. The PPM class consists of several closely related
isoforms of PP2C and is evolutionarily unrelated to the PPP class.

PP1 dephosphorylates many of the proteins phosphorylated by cyclic AMP-dependent protein
kinase (PKA) and is an important regulator of many cAMP-mediated hormone responses in cells. A
number of isoforms have been identified, with the alpha and beta forms being produced by alternative
splicing of the same gene. Both ubiquitous and tissue-specific targeting proteins for PP1 have been
identified. In the brain, inhibition of PP1 activity by the dopamine and adenosine
3',5'-monophosphate-regulated phosphoprotein of 32 kDa (DARPP-32) is necessary for normal dopamine
response in neostriatal neurons (reviewed in Price, N.E. and M.C. Mumby (1999) Curr. Opin.
Neurobiol. 9:336-342). PP1, along with PP2A, has been shown to limit motility in microvascular
endothelial cells, suggesting a role for PSPs in the inhibition of angiogenesis (Gabel, S. et al. (1999)
Otolaryngol. Head Neck Surg. 121:463-468).

PP2A is the main serine/threonine phosphatase. The core PP2A enzyme consists of a single
36 kDa catalytic subunit (C) associated with a 65 kDa scaffold subunit (A), whose role is to recruit
additional regulatory subunits (B). Three gene families encoding B subunits are known (PR55, PR61,
and PR72), each of which contains multiple isoforms, and additional families may exist (Millward,
T.A. et al. (1999) Trends Biochem. Sci. 24:186-191). These "B-type" subunits are cell type- and
tissue-specific and determine the substrate specificity, enzymatic activity, and subcellular localization
of the holoenzyme. The PR55 family is highly conserved and bears a conserved motif (PROSITE
PDOC00785). PR55 increases PP2A activity toward mitogen-activated protein kinase (MAPK) and
MAPK kinase (MEK). PP2A dephosphorylates the MAPK active site, inhibiting the cell's entry into
mitosis. Several proteins can compete with PR55 for PP2A core enzyme binding, including the CKII
kinase catalytic subunit, polyomavirus middle and small T antigens, and SV40 small t antigen.
Viruses may use this mechanism to commandeer PP2A and stimulate progression of the cell through
the cell cycle (Pallas, D.C. et al. (1992) J. Virol. 66:886-893). Altered MAP kinase expression is also
implicated in a variety of disease conditions including cancer, inflammation, immune disorders, and
disorders affecting growth and development. PP2A, in fact, can dephosphorylate and modulate the
activities of more than 30 protein kinases in vitro, and other evidence suggests that the same is true in
vivo for such kinases as PKB, PKC, the calmodulin-dependent kinases, ERK family MAP kinases,
cyclin-dependent kinases, and the IκB kinases (reviewed in Millward et al., supra). PP2A is itself a
substrate for CKI and CKII kinases, and can be stimulated by polycationic macromolecules. A
PP2A-like phosphatase is necessary to maintain the G1 phase destruction of mammalian cyclins A and B
(Bastians, H. et al. (1999) Mol. Biol. Cell 10:3927-3941). PP2A is a major activity in the brain and is
implicated in regulating neurofilament stability and normal neural function, particularly the
phosphorylation of the microtubule-associated protein tau. Hyperphosphorylation of tau has been
proposed to lead to the neuronal degeneration seen in Alzheimer's disease (reviewed in Price and
Mumby, supra).

PP2B, or calcineurin, is a Ca²⁺-activated dimeric phosphatase and is particularly abundant in
the brain. It consists of catalytic and regulatory subunits, and is activated by the binding of the
calcium/calmodulin complex. Calcineurin is the target of the immunosuppressant drugs cyclosporine
and FK506. Along with other cellular factors, these drugs interact with calcineurin and inhibit
phosphatase activity. In T cells, this blocks the calcium-dependent activation of the NF-AT family of
transcription factors, leading to immunosuppression. This family is widely distributed, and it is likely
that calcineurin regulates gene expression in other tissues as well. In neurons, calcineurin modulates
functions which range from the inhibition of neurotransmitter release to desensitization of
postsynaptic NMDA-receptor coupled calcium channels to long term memory (reviewed in Price and
Mumby, supra).

Other members of the PPP class have recently been identified (Cohen, P.T. (1997) Trends
Biochem. Sci. 22:245-251). One of them, PP5, contains regulatory domains with tetratricopeptide
repeats. It can be activated by polyunsaturated fatty acids and anionic phospholipids in vitro and
appears to be involved in a number of signaling pathways, including those controlled by atrial
natriuretic peptide or steroid hormones (reviewed in Andreeva, A.V. and M.A. Kutuzov (1999) Cell
Signal. 11:555-562).

PP2C is a ~42 kDa monomer with broad substrate specificity and is dependent on divalent
cations (mainly Mn²⁺ or Mg²⁺) for its activity. PP2C proteins share a conserved N-terminal region
with an invariant DGH motif, which contains an aspartate residue involved in cation binding
(PROSITE PDOC00792). Targeting proteins and mechanisms regulating PP2C activity have not
been identified. PP2C has been shown to inhibit the stress-responsive p38 and Jun kinase (JNK)
pathways (Takekawa, M. et al. (1998) EMBO J. 17:4744-4752).

In contrast to PSPs, tyrosine-specific phosphatases (PTPs) are generally monomeric proteins
of very diverse size (from 20 kDa to greater than 100 kDa) and structure that function primarily in the
transduction of signals across the plasma membrane. PTPs are categorized as either soluble
phosphatases or transmembrane receptor proteins that contain a phosphatase domain. All PTPs share
a conserved catalytic domain of about 300 amino acids which contains the active site. The active site
consensus sequence includes a cysteine residue which executes a nucleophilic attack on the phosphate

WO 02/10363
PCT/US01/23716

moiety during catalysis (Neel, B.G. and N.K. Tonks (1997) Curr. Opin. Cell Biol. 9:193-204).
Receptor PTPs are made up of an N-terminal extracellular domain of variable length, a
transmembrane region, and a cytoplasmic region that generally contains two copies of the catalytic
domain. Although only the first copy seems to have enzymatic activity, the second copy apparently
affects the substrate specificity of the first. The extracellular domains of some receptor PTPs contain
fibronectin-like repeats, immunoglobulin-like domains, MAM domains (an extracellular motif likely
to have an adhesive function), or carbonic anhydrase-like domains (PROSITE PDOC00323). This
wide variety of structural motifs accounts for the diversity in size and specificity of PTPs.

PTPs play important roles in biological processes such as cell adhesion, lymphocyte
activation, and cell proliferation. PTPμ and κ are involved in cell-cell contacts, perhaps regulating
cadherin/catenin function. A number of PTPs affect cell spreading, focal adhesions, and cell motility,
most of them via the integrin/tyrosine kinase signaling pathway (reviewed in Neel and Tonks, supra).
CD45 phosphatases regulate signal transduction and lymphocyte activation (Ledbetter, J.A. et al.
(1988) Proc. Natl. Acad. Sci. USA 85:8628-8632). Soluble PTPs containing Src-homology-2
domains have been identified (SHPs), suggesting that these molecules might interact with receptor
tyrosine kinases. SHP-1 regulates cytokine receptor signaling by controlling the Janus family PTKs
in hematopoietic cells, as well as signaling by the T-cell receptor and c-Kit (reviewed in Neel and
Tonks, supra). M-phase inducer phosphatase plays a key role in the induction of mitosis by
dephosphorylating and activating the PTK CDC2, leading to cell division (Sadhu, K. et al. (1990)
Proc. Natl. Acad. Sci. USA 87:5139-5143). In addition, the genes encoding at least eight PTPs have
been mapped to chromosomal regions that are translocated or rearranged in various neoplastic
conditions, including lymphoma, small cell lung carcinoma, leukemia, adenocarcinoma, and
neuroblastoma (reviewed in Charbonneau, H. and N.K. Tonks (1992) Annu. Rev. Cell Biol.
8:463-493). The PTP enzyme active site comprises the consensus sequence of the MTM1 gene family. The
MTM1 gene is responsible for X-linked recessive myotubular myopathy, a congenital muscle disorder
that has been linked to Xq28 (Kioschis, P. et al. (1998) Genomics 54:256-266). Many PTKs are
encoded by oncogenes, and it is well known that oncogenesis is often accompanied by increased
tyrosine phosphorylation activity. It is therefore possible that PTPs may serve to prevent or reverse
cell transformation and the growth of various cancers by controlling the levels of tyrosine
phosphorylation in cells. This is supported by studies showing that overexpression of PTP can
suppress transformation in cells and that specific inhibition of PTP can enhance cell transformation
(Charbonneau and Tonks, supra).

Dual specificity phosphatases (DSPs) are structurally more similar to the PTPs than the PSPs.
DSPs bear an extended PTP active site motif with an additional 7 amino acid residues. DSPs are
primarily associated with cell proliferation and include the cell cycle regulators cdc25A, B, and C.
The phosphatases DUSP1 and DUSP2 inactivate the MAPK family members ERK (extracellular
signal-regulated kinase), JNK (c-Jun N-terminal kinase), and p38 on both tyrosine and threonine
residues (PROSITE PDOC00323, supra). In the activated state, these kinases have been implicated
in neuronal differentiation, proliferation, oncogenic transformation, platelet aggregation, and
apoptosis. Thus, DSPs are necessary for proper regulation of these processes (Muda, M. et al. (1996)
J. Biol. Chem. 271:27205-27208). The tumor suppressor PTEN is a DSP that also shows lipid
phosphatase activity. It seems to negatively regulate interactions with the extracellular matrix and
maintains sensitivity to apoptosis. PTEN has been implicated in the prevention of angiogenesis (Giri,
D. and M. Ittmann (1999) Hum. Pathol. 30:419-424) and abnormalities in its expression are
associated with numerous cancers (reviewed in Tamura, M. et al. (1999) J. Natl. Cancer Inst.
91:1820-1828).

Histidine acid phosphatase (HAP; EXPASY EC 3.1.3.2), also known as acid phosphatase,
hydrolyzes a wide spectrum of substrates including alkyl, aryl, and acyl orthophosphate monoesters
and phosphorylated proteins at low pH. HAPs share two regions of conserved sequences, each
centered around a histidine residue which is involved in catalytic activity. Members of the HAP
family include lysosomal acid phosphatase (LAP) and prostatic acid phosphatase (PAP), both
sensitive to inhibition by L-tartrate (PROSITE PDOC00538).

LAP, an orthophosphoric monoesterase of the endosomal/lysosomal compartment, is a
housekeeping gene whose enzymatic activity has been detected in all tissues examined (Geier, C. et
al. (1989) Eur. J. Biochem. 183:611-616). LAP-deficient mice have progressive skeletal disorder and
an increased disposition toward generalized seizures (Saftig, P. et al. (1997) J. Biol. Chem.
272:18628-18635). LAP-deficient patients were found to have the following clinical features:
intermittent vomiting, hypotonia, lethargy, opisthotonos, terminal bleeding, seizures, and death in
early infancy (Online Mendelian Inheritance in Man (OMIM) *200950).

PAP, a prostate epithelium-specific differentiation antigen produced by the prostate gland,
has been used to diagnose and stage prostate cancer. In prostate carcinomas, the enzymatic activity of
PAP was shown to be decreased compared with normal or benign prostate hypertrophy cells (Foti,
A.G. et al. (1977) Cancer Res. 37:4120-4124). Two forms of PAP have been identified, secreted and
intracellular. Mature secreted PAP is detected in the seminal fluid and is active as a glycosylated
homodimer with a molecular weight of approximately 100 kilodaltons. Intracellular PAP is found to
exhibit endogenous phosphotyrosyl protein phosphatase activity and is involved in regulating prostate
cell growth (Meng, T.C. and M.F. Lin (1998) J. Biol. Chem. 273:22096-22104).

Synaptojanin, a polyphosphoinositide phosphatase, dephosphorylates phosphoinositides at
positions 3, 4 and 5 of the inositol ring. Synaptojanin is a major presynaptic protein found at
clathrin-coated endocytic intermediates in nerve terminals, and binds the clathrin coat-associated protein
EPS15. This binding is mediated by the C-terminal region of synaptojanin-170, which has three
Asp-Pro-Phe amino acid repeats. Further, this three-residue repeat has been found to be the binding site
for the EH domains of EPS15 (Haffner, C. et al. (1997) FEBS Lett. 419:175-180). Additionally, synaptojanin
may potentially regulate interactions of endocytic proteins with the plasma membrane, and be
involved in synaptic vesicle recycling (Brodin, L. et al. (2000) Curr. Opin. Neurobiol. 10:312-320).
Mice with a targeted disruption in the synaptojanin 1 gene (Synj1) were shown to support
coat formation of endocytic vesicles more effectively than was seen in wild-type mice, suggesting that
Synj1 can act as a negative regulator of membrane-coat protein interactions. These findings provide
genetic evidence for a crucial role of phosphoinositide metabolism in synaptic vesicle recycling
(Cremona, O. et al. (1999) Cell 99:179-188).

The discovery of new protein phosphatases, and the polynucleotides encoding them, satisfies
a need in the art by providing new compositions which are useful in the diagnosis, prevention, and
treatment of immune system disorders, neurological disorders, developmental disorders, and cell
proliferative disorders, and in the assessment of the effects of exogenous compounds on the
expression of nucleic acid and amino acid sequences of protein phosphatases.

SUMMARY OF THE INVENTION 

The invention features purified polypeptides, protein phosphatases, referred to collectively as
"PP" and individually as "PP-1," "PP-2," "PP-3," "PP-4," "PP-5," "PP-6," "PP-7," "PP-8," "PP-9,"
and "PP-10." In one aspect, the invention provides an isolated polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-10, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-10, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-10, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-10. In one alternative, the
invention provides an isolated polypeptide comprising the amino acid sequence of SEQ ID NO:1-10.
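
The "at least 90% identical" criterion in b) can be illustrated with a small calculation. What follows is a minimal sketch, assuming the two sequences have already been aligned to equal length with "-" marking gaps. The function names, the choice of denominator (aligned columns, ignoring columns where both sequences have a gap), and the example peptides are illustrative assumptions, not the alignment algorithm or parameters on which this specification relies.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    A gap ('-') opposite a residue counts as a mismatch; columns where both
    sequences carry a gap are ignored. This denominator is one common
    convention and is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True when the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical ten-residue peptides differing at one position.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(meets_identity_threshold("ACDEFGHIKL", "ACDEFGHIKV"))
```

In practice the alignment step itself (and its scoring matrix and gap penalties) determines the reported identity, so the sketch only covers the final counting step.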

The invention further provides an isolated polynucleotide encoding a polypeptide selected
from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the
group consisting of SEQ ID NO:1-10, b) a polypeptide comprising a naturally occurring amino acid
sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ
ID NO:1-10, c) a biologically active fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-10, and d) an immunogenic fragment of a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-10.
In one alternative, the polynucleotide encodes a polypeptide selected from the group consisting of
SEQ ID NO:1-10. In another alternative, the polynucleotide is selected from the group consisting of
SEQ ID NO:11-20.

Additionally, the invention provides a recombinant polynucleotide comprising a promoter
sequence operably linked to a polynucleotide encoding a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-10, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-10, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-10, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-10. In one alternative, the
invention provides a cell transformed with the recombinant polynucleotide. In another alternative, the
invention provides a transgenic organism comprising the recombinant polynucleotide.

The invention also provides a method for producing a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-10, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-10, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-10, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-10. The method comprises a)
culturing a cell under conditions suitable for expression of the polypeptide, wherein said cell is
transformed with a recombinant polynucleotide comprising a promoter sequence operably linked to a
polynucleotide encoding the polypeptide, and b) recovering the polypeptide so expressed.

Additionally, the invention provides an isolated antibody which specifically binds to a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-10, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-10, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-10, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-10.

The invention further provides an isolated polynucleotide selected from the group consisting
of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of
SEQ ID NO:11-20, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at
least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:11-20, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide
complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). In one alternative, the
polynucleotide comprises at least 60 contiguous nucleotides.
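
Items c) through e) describe sequences derived from a given polynucleotide by base pairing or by substituting uracil for thymine. The following is a minimal sketch of those two derivations, assuming an unambiguous DNA alphabet (A, C, G, T) and treating the "RNA equivalent" purely at the sequence level; the function names and the example sequence are hypothetical.

```python
# Translation table for Watson-Crick pairing of unambiguous DNA bases.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def rna_equivalent(dna: str) -> str:
    """Return the same base sequence with thymine replaced by uracil."""
    return dna.replace("T", "U").replace("t", "u")


if __name__ == "__main__":
    example = "ATGGCCTTA"  # hypothetical sequence, not taken from SEQ ID NO:11-20
    print(reverse_complement(example))
    print(rna_equivalent(example))
```

Ambiguity codes (such as N) are passed through unchanged by this sketch; a fuller treatment would map them explicitly.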

Additionally, the invention provides a method for detecting a target polynucleotide in a
sample, said target polynucleotide having a sequence of a polynucleotide selected from the group
consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ ID NO:11-20, b) a polynucleotide comprising a naturally occurring polynucleotide
sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of
SEQ ID NO:11-20, c) a polynucleotide complementary to the polynucleotide of a), d) a
polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The
method comprises a) hybridizing the sample with a probe comprising at least 20 contiguous
nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and
which probe specifically hybridizes to said target polynucleotide, under conditions whereby a
hybridization complex is formed between said probe and said target polynucleotide or fragments
thereof, and b) detecting the presence or absence of said hybridization complex, and optionally, if
present, the amount thereof. In one alternative, the probe comprises at least 60 contiguous
nucleotides.
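
The probe requirement in step a), at least 20 contiguous nucleotides complementary to the target, can be checked computationally before any bench work. The sketch below assumes exact Watson-Crick complementarity over an unambiguous DNA alphabet and finds the longest stretch of the probe that matches the reverse complement of the target. The function names and the example target are hypothetical, and the check says nothing about hybridization conditions or specificity against other sequences.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Complementary strand of an unambiguous DNA sequence, 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def longest_complementary_run(probe: str, target: str) -> int:
    """Length of the longest contiguous probe segment complementary to target.

    Implemented as a longest-common-substring search between the probe and
    the reverse complement of the target (dynamic programming, two rows).
    """
    target_rc = reverse_complement(target)
    probe = probe.upper()
    best = 0
    previous = [0] * (len(target_rc) + 1)
    for base in probe:
        current = [0] * (len(target_rc) + 1)
        for j, other in enumerate(target_rc, start=1):
            if base == other:
                current[j] = previous[j - 1] + 1
                if current[j] > best:
                    best = current[j]
        previous = current
    return best


def probe_is_eligible(probe: str, target: str, min_run: int = 20) -> bool:
    """True when the probe has at least min_run contiguous complementary bases."""
    return longest_complementary_run(probe, target) >= min_run


if __name__ == "__main__":
    target = "ATGGCGTACGTTAGCCTAGGATCCGATTACAGGCT"  # hypothetical target
    probe = reverse_complement(target[5:27])        # 22-nucleotide probe
    print(longest_complementary_run(probe, target))
    print(probe_is_eligible(probe, target))
```

The same function can be called with `min_run=60` to check the alternative in which the probe comprises at least 60 contiguous nucleotides.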

The invention further provides a method for detecting a target polynucleotide in a sample,
said target polynucleotide having a sequence of a polynucleotide selected from the group consisting
of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of
SEQ ID NO:11-20, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at
least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:11-20, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide
complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method
comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain
reaction amplification, and b) detecting the presence or absence of said amplified target
polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.

The invention further provides a composition comprising an effective amount of a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-10, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-10, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-10, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-10, and a pharmaceutically acceptable excipient. In one embodiment, the
composition comprises an amino acid sequence selected from the group consisting of SEQ ID
NO:1-10. The invention additionally provides a method of treating a disease or condition associated with
decreased expression of functional PP, comprising administering to a patient in need of such
treatment the composition.

The invention also provides a method for screening a compound for effectiveness as an
agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-10, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-10, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-10, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-10. The method comprises a) exposing a sample comprising the
polypeptide to a compound, and b) detecting agonist activity in the sample. In one alternative, the
invention provides a composition comprising an agonist compound identified by the method and a
pharmaceutically acceptable excipient. In another alternative, the invention provides a method of
treating a disease or condition associated with decreased expression of functional PP, comprising
administering to a patient in need of such treatment the composition.

Additionally, the invention provides a method for screening a compound for effectiveness as
an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an
amino acid sequence selected from the group consisting of SEQ ID NO:1-10, b) a polypeptide
comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid
sequence selected from the group consisting of SEQ ID NO:1-10, c) a biologically active fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-10,
and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-10. The method comprises a) exposing a sample comprising the
polypeptide to a compound, and b) detecting antagonist activity in the sample. In one alternative, the
invention provides a composition comprising an antagonist compound identified by the method and a
pharmaceutically acceptable excipient. In another alternative, the invention provides a method of
treating a disease or condition associated with overexpression of functional PP, comprising
administering to a patient in need of such treatment the composition.

The invention further provides a method of screening for a compound that specifically binds
to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-10, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-10, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-10, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-10. The method comprises a) combining the polypeptide with at least
one test compound under suitable conditions, and b) detecting binding of the polypeptide to the test
compound, thereby identifying a compound that specifically binds to the polypeptide.

The invention further provides a method of screening for a compound that modulates the
activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-10, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-10, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-10, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-10. The method comprises a) combining the polypeptide with at least
one test compound under conditions permissive for the activity of the polypeptide, b) assessing the
activity of the polypeptide in the presence of the test compound, and c) comparing the activity of the
polypeptide in the presence of the test compound with the activity of the polypeptide in the absence
of the test compound, wherein a change in the activity of the polypeptide in the presence of the test
compound is indicative of a compound that modulates the activity of the polypeptide.
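
Step c) of this screening method is a comparison of two activity measurements. A minimal sketch of that comparison follows, assuming activity is reported as a positive number in arbitrary units and that a fold-change cutoff is used to decide whether a change is meaningful. The function name, the labels it returns, and the default cutoff of 1.5 are illustrative assumptions; the specification does not state a threshold.

```python
def classify_modulation(activity_with: float, activity_without: float,
                        min_fold_change: float = 1.5) -> str:
    """Compare phosphatase activity with and without a test compound.

    Returns "increases activity", "decreases activity", or "no change",
    based on the ratio of the two measurements and an assumed cutoff.
    """
    if activity_without <= 0:
        raise ValueError("activity without compound must be positive")
    if activity_with < 0:
        raise ValueError("activity with compound cannot be negative")
    if min_fold_change <= 1.0:
        raise ValueError("min_fold_change must be greater than 1")
    ratio = activity_with / activity_without
    if ratio >= min_fold_change:
        return "increases activity"
    if ratio <= 1.0 / min_fold_change:
        return "decreases activity"
    return "no change"


if __name__ == "__main__":
    # Hypothetical readings in arbitrary units.
    print(classify_modulation(30.0, 10.0))
    print(classify_modulation(4.0, 10.0))
    print(classify_modulation(11.0, 10.0))
```

A real screen would normally use replicate measurements and a statistical test rather than a single ratio; the fixed cutoff here only makes the comparison in step c) concrete.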

The invention further provides a method for screening a compound for effectiveness in
altering expression of a target polynucleotide, wherein said target polynucleotide comprises a
polynucleotide sequence selected from the group consisting of SEQ ID NO:11-20, the method
comprising a) exposing a sample comprising the target polynucleotide to a compound, and b)
detecting altered expression of the target polynucleotide.

The invention further provides a method for assessing toxicity of a test compound, said
method comprising a) treating a biological sample containing nucleic acids with the test compound;
b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20
contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:11-20, ii) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a
polynucleotide sequence selected from the group consisting of SEQ ID NO:11-20, iii) a
polynucleotide having a sequence complementary to i), iv) a polynucleotide complementary to the
polynucleotide of ii), and v) an RNA equivalent of i)-iv). Hybridization occurs under conditions
whereby a specific hybridization complex is formed between said probe and a target polynucleotide
in the biological sample, said target polynucleotide selected from the group consisting of i) a
polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID
NO:11-20, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:11-20,
iii) a polynucleotide complementary to the polynucleotide of i), iv) a polynucleotide complementary
to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Alternatively, the target
polynucleotide comprises a fragment of a polynucleotide sequence selected from the group consisting
of i)-v) above; c) quantifying the amount of hybridization complex; and d) comparing the amount of
hybridization complex in the treated biological sample with the amount of hybridization complex in
an untreated biological sample, wherein a difference in the amount of hybridization complex in the
treated biological sample is indicative of toxicity of the test compound.

BRIEF DESCRIPTION OF THE TABLES 
Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide 
sequences of the present invention. 

Table 2 shows the GenBank identification number and annotation of the nearest GenBank 
homolog for polypeptides of the invention. The probability score for the match between each 
polypeptide and its GenBank homolog is also shown. 

Table 3 shows structural features of polypeptide sequences of the invention, including 
predicted motifs and domains, along with the methods, algorithms, and searchable databases used for 
analysis of the polypeptides. 

Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble 
polynucleotide sequences of the invention, along with selected fragments of the polynucleotide 
sequences. 

Table 5 shows the representative cDNA library for polynucleotides of the invention. 

Table 6 provides an appendix which describes the tissues and vectors used for construction of 
the cDNA libraries shown in Table 5. 

Table 7 shows the tools, programs, and algorithms used to analyze the polynucleotides and 
polypeptides of the invention, along with applicable descriptions, references, and threshold 
parameters. 

DESCRIPTION OF THE INVENTION 

Before the present proteins, nucleotide sequences, and methods are described, it is understood 
that this invention is not limited to the particular machines, materials and methods described, as these 
may vary. It is also to be understood that the terminology used herein is for the purpose of describing 
particular embodiments only, and is not intended to limit the scope of the present invention which 
will be limited only by the appended claims. 

It must be noted that as used herein and in the appended claims, the singular forms "a," "an," 
and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, a 
reference to "a host cell" includes a plurality of such host cells, and a reference to "an antibody" is a 
reference to one or more antibodies and equivalents thereof known to those skilled in the art, and so 
forth. 

Unless defined otherwise, all technical and scientific terms used herein have the same 
meanings as commonly understood by one of ordinary skill in the art to which this invention belongs. 
Although any machines, materials, and methods similar or equivalent to those described herein can be 
used to practice or test the present invention, the preferred machines, materials and methods are now 
described. All publications mentioned herein are cited for the purpose of describing and disclosing 
the cell lines, protocols, reagents and vectors which are reported in the publications and which might 
be used in connection with the invention. Nothing herein is to be construed as an admission that the 
invention is not entitled to antedate such disclosure by virtue of prior invention. 

DEFINITIONS 

"PP" refers to the amino acid sequences of substantially purified PP obtained from any 
species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and 
human, and from any source, whether natural, synthetic, semi-synthetic, or recombinant. 

The term "agonist" refers to a molecule which intensifies or mimics the biological activity of 
PP. Agonists may include proteins, nucleic acids, carbohydrates, small molecules, or any other 
compound or composition which modulates the activity of PP either by directly interacting with PP or 
by acting on components of the biological pathway in which PP participates. 

An "allelic variant" is an alternative form of the gene encoding PP. Allelic variants may 
result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in 
polypeptides whose structure or function may or may not be altered. A gene may have none, one, or 
many allelic variants of its naturally occurring form. Common mutational changes which give rise to 
allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides. 
Each of these types of changes may occur alone, or in combination with the others, one or more times 
in a given sequence. 

"Altered" nucleic acid sequences encoding PP include those sequences with deletions, 
insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as PP or a 
polypeptide with at least one functional characteristic of PP. Included within this definition are 
polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe 
of the polynucleotide encoding PP, and improper or unexpected hybridization to allelic variants, with 
a locus other than the normal chromosomal locus for the polynucleotide sequence encoding PP. The 
encoded protein may also be "altered," and may contain deletions, insertions, or substitutions of 
amino acid residues which produce a silent change and result in a functionally equivalent PP. 
Deliberate amino acid substitutions may be made on the basis of similarity in polarity, charge, 
solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as 
the biological or immunological activity of PP is retained. For example, negatively charged amino 
acids may include aspartic acid and glutamic acid, and positively charged amino acids may include 
lysine and arginine. Amino acids with uncharged polar side chains having similar hydrophilicity 
values may include: asparagine and glutamine; and serine and threonine. Amino acids with 
uncharged side chains having similar hydrophilicity values may include: leucine, isoleucine, and 
valine; glycine and alanine; and phenylalanine and tyrosine. 

The terms "amino acid" and "amino acid sequence" refer to an oligopeptide, peptide, 
polypeptide, or protein sequence, or a fragment of any of these, and to naturally occurring or synthetic 
molecules. Where "amino acid sequence" is recited to refer to a sequence of a naturally occurring 
protein molecule, "amino acid sequence" and like terms are not meant to limit the amino acid 
sequence to the complete native amino acid sequence associated with the recited protein molecule. 

"Amplification" relates to the production of additional copies of a nucleic acid sequence. 
Amplification is generally carried out using polymerase chain reaction (PCR) technologies well 
known in the art. 

The term "antagonist" refers to a molecule which inhibits or attenuates the biological activity 
of PP. Antagonists may include proteins such as antibodies, nucleic acids, carbohydrates, small 
molecules, or any other compound or composition which modulates the activity of PP either by 
directly interacting with PP or by acting on components of the biological pathway in which PP 
participates. 

The term "antibody" refers to intact immunoglobulin molecules as well as to fragments 
thereof, such as Fab, F(ab')2, and Fv fragments, which are capable of binding an epitopic determinant. 
Antibodies that bind PP polypeptides can be prepared using intact polypeptides or using fragments 
containing small peptides of interest as the immunizing antigen. The polypeptide or oligopeptide 
used to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of 
RNA, or synthesized chemically, and can be conjugated to a carrier protein if desired. Commonly 
used carriers that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, 
and keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal. 

The term "antigenic determinant" refers to that region of a molecule (i.e., an epitope) that 
makes contact with a particular antibody. When a protein or a fragment of a protein is used to 
immunize a host animal, numerous regions of the protein may induce the production of antibodies 
which bind specifically to antigenic determinants (particular regions or three-dimensional structures 
on the protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen 
used to elicit the immune response) for binding to an antibody. 

The term "antisense" refers to any composition capable of base-pairing with the "sense" 
(coding) strand of a specific nucleic acid sequence. Antisense compositions may include DNA; 
RNA; peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as 
phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified 
sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or oligonucleotides having 
modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-deoxyguanosine. Antisense 
molecules may be produced by any method including chemical synthesis or transcription. Once 
introduced into a cell, the complementary antisense molecule base-pairs with a naturally occurring 
nucleic acid sequence produced by the cell to form duplexes which block either transcription or 
translation. The designation "negative" or "minus" can refer to the antisense strand, and the 
designation "positive" or "plus" can refer to the sense strand of a reference DNA molecule. 

The term "biologically active" refers to a protein having structural, regulatory, or biochemical 
functions of a naturally occurring molecule. Likewise, "immunologically active" or "immunogenic" 
refers to the capability of the natural, recombinant, or synthetic PP, or of any oligopeptide thereof, to 
induce a specific immune response in appropriate animals or cells and to bind with specific 
antibodies. 

"Complementary" describes the relationship between two single-stranded nucleic acid 
sequences that anneal by base-pairing. For example, 5'-AGT-3' pairs with its complement, 
3'-TCA-5'. 
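
The base-pairing rule in this example can be written out as a short Python sketch. The function names are illustrative only; the code assumes unmodified DNA bases (A, C, G, T) and Watson-Crick pairing.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(seq):
    """Base-by-base complement, listed in the same left-to-right order as seq.
    For a 5'->3' input, the result therefore reads 3'->5'."""
    return "".join(COMPLEMENT[base] for base in seq.upper())

def reverse_complement(seq):
    """Complementary strand rewritten in the conventional 5'->3' direction."""
    return complement(seq)[::-1]
```

For the sequence 5'-AGT-3', `complement("AGT")` gives `"TCA"`, which is the 3'-TCA-5' strand of the definition, and `reverse_complement("AGT")` gives the same strand written 5'-ACT-3'.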

A "composition comprising a given polynucleotide sequence" and a "composition comprising 
a given amino acid sequence" refer broadly to any composition containing the given polynucleotide 
or amino acid sequence. The composition may comprise a dry formulation or an aqueous solution. 
Compositions comprising polynucleotide sequences encoding PP or fragments of PP may be 
employed as hybridization probes. The probes may be stored in freeze-dried form and may be 
associated with a stabilizing agent such as a carbohydrate. In hybridizations, the probe may be 
deployed in an aqueous solution containing salts (e.g., NaCl), detergents (e.g., sodium dodecyl 
sulfate; SDS), and other components (e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.). 

"Consensus sequence" refers to a nucleic acid sequence which has been subjected to repeated 
DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied 
Biosystems, Foster City CA) in the 5' and/or the 3' direction, and resequenced, or which has been 
assembled from one or more overlapping cDNA, EST, or genomic DNA fragments using a computer 
program for fragment assembly, such as the GELVIEW fragment assembly system (GCG, Madison 
WI) or Phrap (University of Washington, Seattle WA). Some sequences have been both extended and 
assembled to produce the consensus sequence. 

"Conservative amino acid substitutions" are those substitutions that are predicted to least 
interfere with the properties of the original protein, i.e., the structure and especially the function of 
the protein is conserved and not significantly changed by such substitutions. The table below shows 
amino acids which may be substituted for an original amino acid in a protein and which are regarded 
as conservative amino acid substitutions. 

Original Residue        Conservative Substitution 
Ala                     Gly, Ser 
Arg                     His, Lys 
Asn                     Asp, Gln, His 
Asp                     Asn, Glu 
Cys                     Ala, Ser 
Gln                     Asn, Glu, His 
Glu                     Asp, Gln, His 
Gly                     Ala 
His                     Asn, Arg, Gln, Glu 
Ile                     Leu, Val 
Leu                     Ile, Val 
Lys                     Arg, Gln, Glu 
Met                     Leu, Ile 
Phe                     His, Met, Leu, Trp, Tyr 
Ser                     Cys, Thr 
Thr                     Ser, Val 
Trp                     Phe, Tyr 
Tyr                     His, Phe, Trp 
Val                     Ile, Leu, Thr 

Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide 
backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, 
(b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of 
the side chain. 
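
As an illustration, the substitution table can be transcribed into a lookup structure and queried programmatically. The sketch below is in Python; the dictionary and function names are hypothetical. Note that the table is directional as written (for example, Ser is listed as a substitute for Ala, but Ala is not listed as a substitute for Ser), and the lookup preserves that.

```python
# Transcription of the conservative substitution table
# (original residue -> residues regarded as conservative substitutes).
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}

def is_conservative(original, substitute):
    """True if the table lists `substitute` as conservative for `original`."""
    return substitute in CONSERVATIVE.get(original, set())
```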

A "deletion" refers to a change in the amino acid or nucleotide sequence that results in the 
absence of one or more amino acid residues or nucleotides. 

The term "derivative" refers to a chemically modified polynucleotide or polypeptide. 
Chemical modifications of a polynucleotide can include, for example, replacement of hydrogen by an 
alkyl, acyl, hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which 
retains at least one biological or immunological function of the natural molecule. A derivative 
polypeptide is one modified by glycosylation, pegylation, or any similar process that retains at least 
one biological or immunological function of the polypeptide from which it was derived. 

A "detectable label" refers to a reporter molecule or enzyme that is capable of generating a 
measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide. 

"Differential expression" refers to increased or upregulated; or decreased, downregulated, or 
absent gene or protein expression, determined by comparing at least two different samples. Such 
comparisons may be carried out between, for example, a treated and an untreated sample, or a 
diseased and a normal sample. 

"Exon shuffling" refers to the recombination of different coding regions (exons). Since an 
exon may represent a structural or functional domain of the encoded protein, new proteins may be 
assembled through the novel reassortment of stable substructures, thus allowing acceleration of the 
evolution of new protein functions. 

A "fragment" is a unique portion of PP or the polynucleotide encoding PP which is identical 
in sequence to but shorter in length than the parent sequence. A fragment may comprise up to the 
entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a 
fragment may comprise from 5 to 1000 contiguous nucleotides or amino acid residues. A fragment 
used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least 5, 10, 
15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides or amino acid 
residues in length. Fragments may be preferentially selected from certain regions of a molecule. For 
example, a polypeptide fragment may comprise a certain length of contiguous amino acids selected 
from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain 
defined sequence. Clearly these lengths are exemplary, and any length that is supported by the 
specification, including the Sequence Listing, tables, and figures, may be encompassed by the present 
embodiments. 

A fragment of SEQ ID NO: 11-20 comprises a region of unique polynucleotide sequence that 
specifically identifies SEQ ID NO: 11-20, for example, as distinct from any other sequence in the 
genome from which the fragment was obtained. A fragment of SEQ ID NO: 11-20 is useful, for 
example, in hybridization and amplification technologies and in analogous methods that distinguish 
SEQ ID NO: 11-20 from related polynucleotide sequences. The precise length of a fragment of SEQ 
ID NO: 11-20 and the region of SEQ ID NO: 11-20 to which the fragment corresponds are routinely 
determinable by one of ordinary skill in the art based on the intended purpose for the fragment. 

A fragment of SEQ ID NO: 1-10 is encoded by a fragment of SEQ ID NO: 11-20. A fragment 
of SEQ ID NO: 1-10 comprises a region of unique amino acid sequence that specifically identifies 
SEQ ID NO: 1-10. For example, a fragment of SEQ ID NO: 1-10 is useful as an immunogenic peptide 
for the development of antibodies that specifically recognize SEQ ID NO: 1-10. The precise length of 
a fragment of SEQ ID NO: 1-10 and the region of SEQ ID NO: 1-10 to which the fragment 
corresponds are routinely determinable by one of ordinary skill in the art based on the intended 
purpose for the fragment. 

A "full length" polynucleotide sequence is one containing at least a translation initiation 
codon (e.g., methionine) followed by an open reading frame and a translation termination codon. A 
"full length" polynucleotide sequence encodes a "full length" polypeptide sequence. 

"Homology" refers to sequence similarity or, interchangeably, sequence identity, between 
two or more polynucleotide sequences or two or more polypeptide sequences. 

The terms "percent identity" and "% identity," as applied to polynucleotide sequences, refer 
to the percentage of residue matches between at least two polynucleotide sequences aligned using a 
standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps 
in the sequences being compared in order to optimize alignment between two sequences, and 
therefore achieve a more meaningful comparison of the two sequences. 
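
As a rough illustration of "percentage of residue matches," the Python sketch below scores two sequences that have already been aligned (gaps shown as "-"). It is not the CLUSTAL V or BLAST calculation: it assumes one simple convention, in which matches are divided by the full alignment length and gap columns count as non-matches. Alignment programs differ in how they treat gaps, so reported values may differ from this sketch.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumed convention: denominator is the full alignment length; any column
    containing a gap is counted as a non-match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)
```

For example, aligning `ACGT-A` against `ACGTTA` gives five matching columns out of six.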

Percent identity between polynucleotide sequences may be determined using the default 
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e 
sequence alignment program. This program is part of the LASERGENE software package, a suite of 
molecular biological analysis programs (DNASTAR, Madison WI). CLUSTAL V is described in 
Higgins, D.G. and P.M. Sharp (1989) CABIOS 5:151-153 and in Higgins, D.G. et al. (1992) CABIOS 
8:189-191. For pairwise alignments of polynucleotide sequences, the default parameters are set as 
follows: Ktuple=2, gap penalty=5, window=4, and "diagonals saved"=4. The "weighted" residue 
weight table is selected as the default. Percent identity is reported by CLUSTAL V as the "percent 
similarity" between aligned polynucleotide sequences. 

Alternatively, a suite of commonly used and freely available sequence comparison algorithms 
is provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment 
Search Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410), which is available 
from several sources, including the NCBI, Bethesda, MD, and on the Internet at 
http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence 
analysis programs including "blastn," that is used to align a known polynucleotide sequence with 
other polynucleotide sequences from a variety of databases. Also available is a tool called "BLAST 2 
Sequences" that is used for direct pairwise comparison of two nucleotide sequences. "BLAST 2 
Sequences" can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2.html. 
The "BLAST 2 Sequences" tool can be used for both blastn and blastp (discussed below). BLAST 
programs are commonly used with gap and other parameters set to default settings. For example, to 
compare two nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version 
2.0.12 (April-21-2000) set at default parameters. Such default parameters may be, for example: 
Matrix: BLOSUM62 
Reward for match: 1 
Penalty for mismatch: -2 
Open Gap: 5 and Extension Gap: 2 penalties 
Gap X drop-off: 50 
Expect: 10 
Word Size: 11 
Filter: on 

Percent identity may be measured over the length of an entire defined sequence, for example, 
as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, 
over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at 
least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous 
nucleotides. Such lengths are exemplary only, and it is understood that any fragment length 
supported by the sequences shown herein, in the tables, figures, or Sequence Listing, may be used to 
describe a length over which percentage identity may be measured. 

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode 
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes 
in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid 
sequences that all encode substantially the same protein. 
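
A small Python sketch can make the point about degeneracy concrete. It assumes the standard genetic code (built here from the conventional T, C, A, G codon ordering) and uses two made-up six-nucleotide sequences; the names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' marks stop.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a DNA coding sequence (reading frame 1) into one-letter amino acids.
    Trailing bases that do not complete a codon are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))
```

The hypothetical sequences `GCTGAA` and `GCCGAG` differ at two of six positions, yet both translate to the dipeptide Ala-Glu (`"AE"`).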

The phrases "percent identity" and "% identity," as applied to polypeptide sequences, refer to 
the percentage of residue matches between at least two polypeptide sequences aligned using a 
standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some 
alignment methods take into account conservative amino acid substitutions. Such conservative 
substitutions, explained in more detail above, generally preserve the charge and hydrophobicity at the 
site of substitution, thus preserving the structure (and therefore function) of the polypeptide. 

Percent identity between polypeptide sequences may be determined using the default 
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e 
sequence alignment program (described and referenced above). For pairwise alignments of 
polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap 
penalty=3, window=5, and "diagonals saved"=5. The PAM250 matrix is selected as the default 
residue weight table. As with polynucleotide alignments, the percent identity is reported by 
CLUSTAL V as the "percent similarity" between aligned polypeptide sequence pairs. 

Alternatively the NCBI BLAST software suite may be used. For example, for a pairwise 
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version 
2.0.12 (April-21-2000) with blastp set at default parameters. Such default parameters may be, for 
example: 

Matrix: BLOSUM62 
Open Gap: 11 and Extension Gap: 1 penalties 
Gap X drop-off: 50 
Expect: 10 
Word Size: 3 
Filter: on 

Percent identity may be measured over the length of an entire defined polypeptide sequence, 
for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for 
example, over the length of a fragment taken from a larger, defined polypeptide sequence, for 
instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 
150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment 
length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be 
used to describe a length over which percentage identity may be measured. 

"Human artificial chromosomes" (HACs) are linear microchromosomes which may contain 
DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for 
chromosome replication, segregation and maintenance. 

The term "humanized antibody" refers to an antibody molecule in which the amino acid 
sequence in the non-antigen binding regions has been altered so that the antibody more closely 
resembles a human antibody, and still retains its original binding ability. 

"Hybridization" refers to the process by which a polynucleotide strand anneals with a 
complementary strand through base pairing under defined hybridization conditions. Specific 
hybridization is an indication that two nucleic acid sequences share a high degree of complementarity. 
Specific hybridization complexes form under permissive annealing conditions and remain hybridized 
after the "washing" step(s). The washing step(s) is particularly important in determining the 
stringency of the hybridization process, with more stringent conditions allowing less non-specific 
binding, i.e., binding between pairs of nucleic acid strands that are not perfectly matched. Permissive 
conditions for annealing of nucleic acid sequences are routinely determinable by one of ordinary skill 
in the art and may be consistent among hybridization experiments, whereas wash conditions may be 
varied among experiments to achieve the desired stringency, and therefore hybridization specificity. 
Permissive annealing conditions occur, for example, at 68°C in the presence of about 6 x SSC, about 
1% (w/v) SDS, and about 100 µg/ml sheared, denatured salmon sperm DNA. 

Generally, stringency of hybridization is expressed, in part, with reference to the temperature 
under which the wash step is carried out. Such wash temperatures are typically selected to be about 
5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic 
strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of 
the target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and 
conditions for nucleic acid hybridization are well known and can be found in Sambrook, J. et al. 
(1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, 
Plainview NY; specifically see volume 2, chapter 9. 

High stringency conditions for hybridization between polynucleotides of the present 
invention include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS, 
for 1 hour. Alternatively, temperatures of about 65°C, 60°C, 55°C, or 42°C may be used. SSC 
concentration may be varied from about 0.1 to 2 x SSC, with SDS being present at about 0.1%. 
Typically, blocking reagents are used to block non-specific hybridization. Such blocking reagents 
include, for instance, sheared and denatured salmon sperm DNA at about 100-200 µg/ml. Organic 
solvent, such as formamide at a concentration of about 35-50% v/v, may also be used under particular 
circumstances, such as for RNA:DNA hybridizations. Useful variations on these wash conditions 
will be readily apparent to those of ordinary skill in the art. Hybridization, particularly under high 
stringency conditions, may be suggestive of evolutionary similarity between the nucleotides. Such 
similarity is strongly indicative of a similar role for the nucleotides and their encoded polypeptides. 
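
The relationship between wash temperature and melting temperature can be illustrated with a short Python sketch. The specification does not reproduce a Tm equation, so the one used here is an assumption: a widely cited empirical approximation for long DNA:DNA duplexes, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 0.63(% formamide) - 600/L. The conversion of SSC strength to sodium concentration (about 0.195 M Na+ in 1 x SSC) and the example duplex (500 bp, 50% GC) are likewise assumptions chosen for illustration.

```python
import math

# Assumed: 1 x SSC is 0.15 M NaCl plus 0.015 M trisodium citrate, about 0.195 M Na+.
NA_MOLAR_PER_1X_SSC = 0.195

def ssc_to_na_molar(ssc_x):
    """Approximate Na+ molarity for a given SSC strength (e.g. 0.2 for 0.2 x SSC)."""
    return NA_MOLAR_PER_1X_SSC * ssc_x

def melting_temp_c(na_molar, gc_percent, length_bp, formamide_percent=0.0):
    """Empirical Tm estimate (deg C) for a long DNA:DNA duplex (assumed formula)."""
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / length_bp
    )

def within_wash_window(tm_c, wash_c, low=5.0, high=20.0):
    """True if the wash temperature lies about 5 to 20 deg C below Tm."""
    return low <= tm_c - wash_c <= high
```

Under these assumptions, a 500 bp duplex with 50% GC in 0.2 x SSC has an estimated Tm a little above 77°C, so a 68°C wash falls inside the 5°C to 20°C window described above.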

The term "hybridization complex" refers to a complex formed between two nucleic acid 
sequences by virtue of the formation of hydrogen bonds between complementary bases. A 
hybridization complex may be formed in solution (e.g., C0t or R0t analysis) or formed between one 
nucleic acid sequence present in solution and another nucleic acid sequence immobilized on a solid 
support (e.g., paper, membranes, filters, chips, pins or glass slides, or any other appropriate substrate 
to which cells or their nucleic acids have been fixed). 

The words "insertion" and "addition" refer to changes in an amino acid or nucleotide 
sequence resulting in the addition of one or more amino acid residues or nucleotides, respectively. 

"Immune response" can refer to conditions associated with inflammation, trauma, immune 
disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression 
of various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect 
cellular and systemic defense systems. 

An "immunogenic fragment" is a polypeptide or oligopeptide fragment of PP which is 
capable of eliciting an immune response when introduced into a living organism, for example, a 
mammal. The term "immunogenic fragment" also includes any polypeptide or oligopeptide fragment 
of PP which is useful in any of the antibody production methods disclosed herein or known in the art. 

The term "microarray" refers to an arrangement of a plurality of polynucleotides, 
polypeptides, or other chemical compounds on a substrate. 

The terms "element" and "array element" refer to a polynucleotide, polypeptide, or other 
chemical compound having a unique and defined position on a microarray. 

The term "modulate" refers to a change in the activity of PP. For example, modulation may 
cause an increase or a decrease in protein activity, binding characteristics, or any other biological, 
functional, or immunological properties of PP. 

The phrases "nucleic acid" and "nucleic acid sequence" refer to a nucleotide, oligonucleotide, 
polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or 
synthetic origin which may be single-stranded or double-stranded and may represent the sense or the 
antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material. 

"Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a 
functional relationship with a second nucleic acid sequence. For instance, a promoter is operably 
linked to a coding sequence if the promoter affects the transcription or expression of the coding 
sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where 
necessary to join two protein coding regions, in the same reading frame. 

"Peptide nucleic acid" (PNA) refers to an antisense molecule or anti-gene agent which 
comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of 
amino acid residues ending in lysine. The terminal lysine confers solubility to the composition. 
PNAs preferentially bind complementary single stranded DNA or RNA and stop transcript 
elongation, and may be pegylated to extend their lifespan in the cell. 

"Post-translational modification" of a PP may involve lipidation, glycosylation, 
phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in 
the art. These processes may occur synthetically or biochemically. Biochemical modifications will 
vary by cell type depending on the enzymatic milieu of PP. 

"Probe" refers to nucleic acid sequences encoding PP, their complements, or fragments 
thereof, which are used to detect identical, allelic or related nucleic acid sequences. Probes are 
isolated oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. 
Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. 
"Primers" are short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target 
polynucleotide by complementary base-pairing. The primer may then be extended along the target 
DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification (and 
identification) of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR). 

Probes and primers as used in the present invention typically comprise at least 15 contiguous
nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also
be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90, 100,
or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers
may be considerably longer than these examples, and it is understood that any length supported by the
specification, including the tables, figures, and Sequence Listing, may be used.

Methods for preparing and using probes and primers are described in the references, for
example Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold
Spring Harbor Press, Plainview NY; Ausubel, F.M. et al. (1987) Current Protocols in Molecular
Biology, Greene Publ. Assoc. & Wiley-Intersciences, New York NY; Innis, M. et al. (1990) PCR
Protocols, A Guide to Methods and Applications, Academic Press, San Diego CA. PCR primer pairs
can be derived from a known sequence, for example, by using computer programs intended for that
purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge
MA).

Oligonucleotides for use as primers are selected using software known in the art for such
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to
5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer
selection programs have incorporated additional features for expanded capabilities. For example, the
PrimOU primer selection program (available to the public from the Genome Center at University of
Texas South West Medical Center, Dallas TX) is capable of choosing specific primers from
megabase sequences and is thus useful for designing primers on a genome-wide scope. The Primer3
primer selection program (available to the public from the Whitehead Institute/MIT Center for
Genome Research, Cambridge MA) allows the user to input a "mispriming library," in which
sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the
selection of oligonucleotides for microarrays. (The source code for the latter two primer selection
programs may also be obtained from their respective sources and modified to meet the user's specific
needs.) The PrimeGen program (available to the public from the UK Human Genome Mapping
Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments,
thereby allowing selection of primers that hybridize to either the most conserved or least conserved
regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both
unique and conserved oligonucleotides and polynucleotide fragments. The oligonucleotides and
polynucleotide fragments identified by any of the above selection methods are useful in hybridization
technologies, for example, as PCR or sequencing primers, microarray elements, or specific probes to
identify fully or partially complementary polynucleotides in a sample of nucleic acids. Methods of
oligonucleotide selection are not limited to those described above.

A "recombinant nucleic acid" is a sequence that is not naturally occurring or has a sequence
that is made by an artificial combination of two or more otherwise separated segments of sequence.
This artificial combination is often accomplished by chemical synthesis or, more commonly, by the
artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques
such as those described in Sambrook, supra. The term recombinant includes nucleic acids that have
been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a
recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter
sequence. Such a recombinant nucleic acid may be part of a vector that is used, for example, to
transform a cell.

Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a
vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is
expressed, inducing a protective immunological response in the mammal.
A "regulatory element" refers to a nucleic acid sequence usually derived from untranslated
regions of a gene and includes enhancers, promoters, introns, and 5' and 3' untranslated regions
(UTRs). Regulatory elements interact with host or viral proteins which control transcription,
translation, or RNA stability.

"Reporter molecules" are chemical or biochemical moieties used for labeling a nucleic acid,
amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent,
chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and
other moieties known in the art.

An "RNA equivalent," in reference to a DNA sequence, is composed of the same linear
sequence of nucleotides as the reference DNA sequence with the exception that all occurrences of the
nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose
instead of deoxyribose.
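
The sequence-level part of this definition can be illustrated with a minimal Python sketch. The function name rna_equivalent is hypothetical and is not part of the specification; the sketch models only the thymine-to-uracil replacement in the written sequence, since the ribose backbone has no textual representation.

```python
def rna_equivalent(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence.

    The linear order of nucleotides is kept; every thymine (T) is
    replaced with uracil (U). Hypothetical helper for illustration only.
    """
    dna = dna.upper()
    invalid = set(dna) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(invalid)}")
    return dna.replace("T", "U")


print(rna_equivalent("ATGGCTTAA"))  # AUGGCUUAA
```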

The term "sample" is used in its broadest sense. A sample suspected of containing PP,
nucleic acids encoding PP, or fragments thereof may comprise a bodily fluid; an extract from a cell,
chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or cDNA, in
solution or bound to a substrate; a tissue; a tissue print; etc.

The terms "specific binding" and "specifically binding" refer to that interaction between a
protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or
synthetic binding composition. The interaction is dependent upon the presence of a particular
structure of the protein, e.g., the antigenic determinant or epitope, recognized by the binding
molecule. For example, if an antibody is specific for epitope "A," the presence of a polypeptide
comprising the epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A
and the antibody will reduce the amount of labeled A that binds to the antibody.

The term "substantially purified" refers to nucleic acid or amino acid sequences that are
removed from their natural environment and are isolated or separated, and are at least 60% free,
preferably at least 75% free, and most preferably at least 90% free from other components with which
they are naturally associated.

A "substitution" refers to the replacement of one or more amino acid residues or nucleotides
by different amino acid residues or nucleotides, respectively.

"Substrate" refers to any suitable rigid or semi-rigid support including membranes, filters,
chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers,
microparticles and capillaries. The substrate can have a variety of surface forms, such as wells,
trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.

A "transcript image" refers to the collective pattern of gene expression by a particular cell
type or tissue under given conditions at a given time.

"Transformation" describes a process by which exogenous DNA is introduced into a recipient
cell. Transformation may occur under natural or artificial conditions according to various methods
well known in the art, and may rely on any known method for the insertion of foreign nucleic acid
sequences into a prokaryotic or eukaryotic host cell. The method for transformation is selected based
on the type of host cell being transformed and may include, but is not limited to, bacteriophage or
viral infection, electroporation, heat shock, lipofection, and particle bombardment. The term
"transformed cells" includes stably transformed cells in which the inserted DNA is capable of
replication either as an autonomously replicating plasmid or as part of the host chromosome, as well
as transiently transformed cells which express the inserted DNA or RNA for limited periods of time.

A "transgenic organism," as used herein, is any organism, including but not limited to
animals and plants, in which one or more of the cells of the organism contains heterologous nucleic
acid introduced by way of human intervention, such as by transgenic techniques well known in the
art. The nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor
of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with
a recombinant virus. The term genetic manipulation does not include classical cross-breeding, or in
vitro fertilization, but rather is directed to the introduction of a recombinant DNA molecule. The
transgenic organisms contemplated in accordance with the present invention include bacteria,
cyanobacteria, fungi, plants and animals. The isolated DNA of the present invention can be
introduced into the host by methods known in the art, for example infection, transfection,
transformation or transconjugation. Techniques for transferring the DNA of the present invention
into such organisms are widely known and provided in references such as Sambrook et al. (1989),
supra.

A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having
at least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of
the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9
(May-07-1999) set at default parameters. Such a pair of nucleic acids may show, for example, at least 50%, at
least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least
93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater
sequence identity over a certain defined length. A variant may be described as, for example, an
"allelic" (as defined above), "splice," "species," or "polymorphic" variant. A splice variant may have
significant identity to a reference molecule, but will generally have a greater or lesser number of
polynucleotides due to alternate splicing of exons during mRNA processing. The corresponding
polypeptide may possess additional functional domains or lack domains that are present in the
reference molecule. Species variants are polynucleotide sequences that vary from one species to
another. The resulting polypeptides will generally have significant amino acid identity relative to
each other. A polymorphic variant is a variation in the polynucleotide sequence of a particular gene
between individuals of a given species. Polymorphic variants also may encompass "single nucleotide
polymorphisms" (SNPs) in which the polynucleotide sequence varies by one nucleotide base. The
presence of SNPs may be indicative of, for example, a certain population, a disease state, or a
propensity for a disease state.

A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having
at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of
the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool Version 2.0.9
(May-07-1999) set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at
least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least
94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence
identity over a certain defined length of one of the polypeptides.
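
The percent identity thresholds used in the two "variant" definitions can be illustrated with a simplified Python sketch. It is not a substitute for blastn or blastp: the BLAST 2 Sequences tool builds its own local alignment, whereas this sketch assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it takes identity as matching positions divided by alignment length. The function names percent_identity and is_variant are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: both strings are the same length and use '-' for gaps.
    A column counts as identical only if both residues match and
    neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def is_variant(aligned_a: str, aligned_b: str, threshold: float = 40.0) -> bool:
    """Apply the 'at least 40% sequence identity' floor from the definition."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Six of eight alignment columns are identical.
print(percent_identity("MKTAYIAK", "MKTA-IGK"))  # 75.0
print(is_variant("MKTAYIAK", "MKTA-IGK"))        # True
```

Other conventions for the denominator exist (for example, the length of the shorter sequence), so a reported identity should always state the length over which it was calculated, as the definitions above do.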


THE INVENTION 

The invention is based on the discovery of new human protein phosphatases (PP), the
polynucleotides encoding PP, and the use of these compositions for the diagnosis, treatment, or
prevention of immune system disorders, neurological disorders, developmental disorders, and cell
proliferative disorders.

Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide
sequences of the invention. Each polynucleotide and its corresponding polypeptide are correlated to a
single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is denoted
by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an Incyte
polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide sequence is
denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ ID NO:) and
an Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as shown.

Table 2 shows sequences with homology to the polypeptides of the invention as identified by
BLAST analysis against the GenBank protein (genpept) database. Columns 1 and 2 show the
polypeptide sequence identification number (Polypeptide SEQ ID NO:) and the corresponding Incyte
polypeptide sequence number (Incyte Polypeptide ID) for polypeptides of the invention. Column 3
shows the GenBank identification number (GenBank ID NO:) of the nearest GenBank homolog.
Column 4 shows the probability score for the match between each polypeptide and its GenBank
homolog. Column 5 shows the annotation of the GenBank homolog along with relevant citations
where applicable, all of which are expressly incorporated by reference herein.

Table 3 shows various structural features of the polypeptides of the invention. Columns 1
and 2 show the polypeptide sequence identification number (SEQ ID NO:) and the corresponding
Incyte polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention.
Column 3 shows the number of amino acid residues in each polypeptide. Column 4 shows potential
phosphorylation sites, and column 5 shows potential glycosylation sites, as determined by the
MOTIFS program of the GCG sequence analysis software package (Genetics Computer Group,
Madison WI). Column 6 shows amino acid residues comprising signature sequences, domains, and
motifs. Column 7 shows analytical methods for protein structure/function analysis and in some cases,
searchable databases to which the analytical methods were applied.

Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and these
properties establish that the claimed polypeptides are protein phosphatases. For example, SEQ ID
NO:2 is 98% identical to mouse putative protein phosphatase type 2C (GenBank ID g4325051) as
determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST
probability score is 1.0e-89, which indicates the probability of obtaining the observed polypeptide
sequence alignment by chance. SEQ ID NO:2 also contains a protein phosphatase 2C domain as
determined by searching for statistically significant matches in the hidden Markov model (HMM)-
based PFAM database of conserved protein family domains. (See Table 3.) Data from BLIMPS,
MOTIFS, and PROFILESCAN analyses provide further corroborative evidence that SEQ ID NO:2 is
a protein phosphatase 2C. In an alternative example, SEQ ID NO:4 is 46% identical to human protein
phosphatase (GenBank ID g6692782) as determined by BLAST. (See Table 2.) The BLAST
probability score is 2.0e-33. SEQ ID NO:4 also contains a dual specificity phosphatase, catalytic
domain as determined by searching for statistically significant matches in the HMM-based PFAM
database. (See Table 3.) Data from BLAST-DOMO analysis provides further corroborative evidence
that SEQ ID NO:4 is a dual specificity protein phosphatase. In an alternative example, SEQ ID NO:6
is 45% identical to murine lysosomal acid phosphatase (GenBank ID g52871) as determined by
BLAST. (See Table 2.) The BLAST probability score is 2.3e-83. SEQ ID NO:6 also contains a
histidine acid phosphatase domain as determined by searching for statistically significant matches in
the HMM-based PFAM database. (See Table 3.) Data from BLIMPS, MOTIFS, and PROFILESCAN
analyses provide further corroborative evidence that SEQ ID NO:6 is an acid phosphatase. In an
alternative example, SEQ ID NO:7 is 52% identical to mouse neuronal tyrosine threonine
phosphatase 1 (GenBank ID g1781037) as determined by BLAST. (See Table 2.) The BLAST
probability score is 1.3e-131. SEQ ID NO:7 also contains a dual specificity phosphatase active site
domain as determined by searching for statistically significant matches in the HMM-based PFAM
database. (See Table 3.) Data from BLIMPS, MOTIFS, and PROFILESCAN analyses provide
further corroborative evidence that SEQ ID NO:7 is a tyrosine phosphatase. In an alternative
example, SEQ ID NO:8 is 61% identical to human tyrosine phosphatase (GenBank ID g6650693) as
determined by BLAST. (See Table 2.) The BLAST probability score is 1.0e-89. SEQ ID NO:8 also
contains a transmembrane domain as determined by searching for statistically significant matches in
the HMM-based PFAM database. (See Table 3.) Data from MOTIFS analyses provide further
corroborative evidence that SEQ ID NO:8 is a tyrosine specific protein phosphatase. Tyrosine
phosphatases are one of two general categories of protein phosphatases. In an alternative example,
SEQ ID NO:10 is 55% identical to human mitogen-activated protein kinase phosphatase (GenBank ID
g9294745) as determined by BLAST. (See Table 2.) The BLAST probability score is 1.3e-50. SEQ
ID NO:10 also contains a dual specificity phosphatase catalytic domain as determined by searching
for statistically significant matches in the HMM-based PFAM database. (See Table 3.) Data from
BLIMPS, MOTIFS, and PROFILESCAN analyses provide further corroborative evidence that SEQ
ID NO:10 is a protein kinase phosphatase. SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, and SEQ ID
NO:9 were analyzed and annotated in a similar manner. The algorithms and parameters for the
analysis of SEQ ID NO:1-10 are described in Table 7.

As shown in Table 4, the full length polynucleotide sequences of the present invention were
assembled using cDNA sequences or coding (exon) sequences derived from genomic DNA, or any
combination of these two types of sequences. Columns 1 and 2 list the polynucleotide sequence
identification number (Polynucleotide SEQ ID NO:) and the corresponding Incyte polynucleotide
consensus sequence number (Incyte Polynucleotide ID) for each polynucleotide of the invention.
Column 3 shows the length of each polynucleotide sequence in basepairs. Column 4 lists fragments
of the polynucleotide sequences which are useful, for example, in hybridization or amplification
technologies that identify SEQ ID NO:11-20 or that distinguish between SEQ ID NO:11-20 and
related polynucleotide sequences. Column 5 shows identification numbers corresponding to cDNA
sequences, coding sequences (exons) predicted from genomic DNA, and/or sequence assemblages
comprised of both cDNA and genomic DNA. These sequences were used to assemble the full length
polynucleotide sequences of the invention. Columns 6 and 7 of Table 4 show the nucleotide start (5')
and stop (3') positions of the cDNA and/or genomic sequences in column 5 relative to their respective
full length sequences.

The identification numbers in column 5 of Table 4 may refer specifically, for example, to
Incyte cDNAs along with their corresponding cDNA libraries. For example, 6024861H1 is the
identification number of an Incyte cDNA sequence, and TESTNOT11 is the cDNA library from
which it is derived. Incyte cDNAs for which cDNA libraries are not indicated were derived from
pooled cDNA libraries (e.g., 71907683V1). Alternatively, the identification numbers in column 5
may refer to GenBank cDNAs or ESTs (e.g., g2114900) which contributed to the assembly of the
full length polynucleotide sequences. In addition, the identification numbers in column 5 may identify
sequences derived from the ENSEMBL (The Sanger Centre, Cambridge, UK) database (i.e., those
sequences including the designation "ENST"). Alternatively, the identification numbers in column 5
may be derived from the NCBI RefSeq Nucleotide Sequence Records Database (i.e., those sequences
including the designation "NM" or "NT") or the NCBI RefSeq Protein Sequence Records (i.e., those
sequences including the designation "NP"). Alternatively, the identification numbers in column 5
may refer to assemblages of both cDNA and Genscan-predicted exons brought together by an "exon
stitching" algorithm. For example, FL_XXXXXX_N1_N2_YYYYY_N3_N4 represents a "stitched"
sequence in which XXXXXX is the identification number of the cluster of sequences to which the
algorithm was applied, and YYYYY is the number of the prediction generated by the algorithm, and
N1, N2, N3, ..., if present, represent specific exons that may have been manually edited during analysis (See
Example V). Alternatively, the identification numbers in column 5 may refer to assemblages of
exons brought together by an "exon-stretching" algorithm. For example,
FLXXXXXX_gAAAAA_gBBBBB_1_N is the identification number of a "stretched" sequence, with
XXXXXX being the Incyte project identification number, gAAAAA being the GenBank identification
number of the human genomic sequence to which the "exon-stretching" algorithm was applied,
gBBBBB being the GenBank identification number or NCBI RefSeq identification number of the
nearest GenBank protein homolog, and N referring to specific exons (See Example V). In instances
where a RefSeq sequence was used as a protein homolog for the "exon-stretching" algorithm, a
RefSeq identifier (denoted by "NM," "NP," or "NT") may be used in place of the GenBank identifier
(i.e., gBBBBB).

Alternatively, a prefix identifies component sequences that were hand-edited, predicted from
genomic DNA sequences, or derived from a combination of sequence analysis methods. The
following Table lists examples of component sequence prefixes and corresponding sequence analysis
methods associated with the prefixes (see Example IV and Example V).



Prefix            Type of analysis and/or examples of programs

GNN, GFG, ENST    Exon prediction from genomic sequences using, for example,
                  GENSCAN (Stanford University, CA, USA) or FGENES
                  (Computer Genomics Group, The Sanger Centre, Cambridge, UK).

GBI               Hand-edited analysis of genomic sequences.

FL                Stitched or stretched genomic sequences (see Example V).

INCY              Full length transcript and exon prediction from mapping of EST
                  sequences to the genome. Genomic location and EST composition
                  data are combined to predict the exons and resulting transcript.



In some cases, Incyte cDNA coverage redundant with the sequence coverage shown in
column 5 was obtained to confirm the final consensus polynucleotide sequence, but the relevant
Incyte cDNA identification numbers are not shown.

Table 5 shows the representative cDNA libraries for those full length polynucleotide
sequences which were assembled using Incyte cDNA sequences. The representative cDNA library is
the Incyte cDNA library which is most frequently represented by the Incyte cDNA sequences which
were used to assemble and confirm the above polynucleotide sequences. The tissues and vectors
which were used to construct the cDNA libraries shown in Table 5 are described in Table 6.

The invention also encompasses PP variants. A preferred PP variant is one which has at least
about 80%, or alternatively at least about 90%, or even at least about 95% amino acid sequence
identity to the PP amino acid sequence, and which contains at least one functional or structural
characteristic of PP.

The invention also encompasses polynucleotides which encode PP. In a particular
embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected
from the group consisting of SEQ ID NO:11-20, which encodes PP. The polynucleotide sequences of
SEQ ID NO:11-20, as presented in the Sequence Listing, embrace the equivalent RNA sequences,
wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar backbone
is composed of ribose instead of deoxyribose.

The invention also encompasses a variant of a polynucleotide sequence encoding PP. In
particular, such a variant polynucleotide sequence will have at least about 70%, or alternatively at
least about 85%, or even at least about 95% polynucleotide sequence identity to the polynucleotide
sequence encoding PP. A particular aspect of the invention encompasses a variant of a
polynucleotide sequence comprising a sequence selected from the group consisting of SEQ ID
NO:11-20 which has at least about 70%, or alternatively at least about 85%, or even at least about
95% polynucleotide sequence identity to a nucleic acid sequence selected from the group consisting
of SEQ ID NO:11-20. Any one of the polynucleotide variants described above can encode an amino
acid sequence which contains at least one functional or structural characteristic of PP.

It will be appreciated by those skilled in the art that as a result of the degeneracy of the
genetic code, a multitude of polynucleotide sequences encoding PP, some bearing minimal similarity
to the polynucleotide sequences of any known and naturally occurring gene, may be produced. Thus,
the invention contemplates each and every possible variation of polynucleotide sequence that could
be made by selecting combinations based on possible codon choices. These combinations are made
in accordance with the standard triplet genetic code as applied to the polynucleotide sequence of
naturally occurring PP, and all such variations are to be considered as being specifically disclosed.

Although nucleotide sequences which encode PP and its variants are generally capable of
hybridizing to the nucleotide sequence of the naturally occurring PP under appropriately selected
conditions of stringency, it may be advantageous to produce nucleotide sequences encoding PP or its
derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally occurring
codons. Codons may be selected to increase the rate at which expression of the peptide occurs in a
particular prokaryotic or eukaryotic host in accordance with the frequency with which particular
codons are utilized by the host. Other reasons for substantially altering the nucleotide sequence
encoding PP and its derivatives without altering the encoded amino acid sequences include the
production of RNA transcripts having more desirable properties, such as a greater half-life, than
transcripts produced from the naturally occurring sequence.
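
The scale of the "multitude" of coding sequences permitted by codon degeneracy can be illustrated with a short Python sketch. Under the standard genetic code, each amino acid is specified by between one and six codons, so the number of distinct coding sequences for a peptide is the product of the per-residue codon counts. The names CODON_COUNTS and count_encodings are hypothetical, and the example peptide is arbitrary and is not taken from the Sequence Listing.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (one-letter codes; the three stop codons are excluded).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide: str) -> int:
    """Count the distinct nucleotide sequences that encode a peptide.

    Each residue contributes an independent choice among its synonymous
    codons, so the total is the product of the per-residue counts.
    """
    try:
        return prod(CODON_COUNTS[aa] for aa in peptide.upper())
    except KeyError as err:
        raise ValueError(f"unknown amino acid code: {err.args[0]}") from None


# Met (1 codon) x Leu (6 codons) x Arg (6 codons)
print(count_encodings("MLR"))  # 36
```

Because the count grows multiplicatively with peptide length, even a short polypeptide admits far more synonymous coding sequences than could be enumerated individually, which is why codon selection is normally guided by host codon usage rather than exhaustive search.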

The invention also encompasses production of DNA sequences which encode PP and PP
derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the synthetic
sequence may be inserted into any of the many available expression vectors and cell systems using
reagents well known in the art. Moreover, synthetic chemistry may be used to introduce mutations
into a sequence encoding PP or any fragment thereof.

Also encompassed by the invention are polynucleotide sequences that are capable of
hybridizing to the claimed polynucleotide sequences, and, in particular, to those shown in SEQ ID
NO:11-20 and fragments thereof under various conditions of stringency. (See, e.g., Wahl, G.M. and
S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol.
152:507-511.) Hybridization conditions, including annealing and wash conditions, are described in
"Definitions."

Methods for DNA sequencing are well known in the art and may be used to practice any of
the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment
of DNA polymerase I, SEQUENASE (US Biochemical, Cleveland OH), Taq polymerase (Applied
Biosystems), thermostable T7 polymerase (Amersham Pharmacia Biotech, Piscataway NJ), or
combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE
amplification system (Life Technologies, Gaithersburg MD). Preferably, sequence preparation is
automated with machines such as the MICROLAB 2200 liquid transfer system (Hamilton, Reno NV),
PTC200 thermal cycler (MJ Research, Watertown MA) and ABI CATALYST 800 thermal cycler
(Applied Biosystems). Sequencing is then carried out using either the ABI 373 or 377 DNA
sequencing system (Applied Biosystems), the MEGABACE 1000 DNA sequencing system
(Molecular Dynamics, Sunnyvale CA), or other systems known in the art. The resulting sequences
are analyzed using a variety of algorithms which are well known in the art. (See, e.g., Ausubel, F.M.
(1997) Short Protocols in Molecular Biology, John Wiley & Sons, New York NY, unit 7.7; Meyers,
R.A. (1995) Molecular Biology and Biotechnology, Wiley VCH, New York NY, pp. 856-853.)
The nucleic acid sequences encoding PP may be extended utilizing a partial nucleotide
sequence and employing various PCR-based methods known in the art to detect upstream sequences,
such as promoters and regulatory elements. For example, one method which may be employed,
restriction-site PCR, uses universal and nested primers to amplify unknown sequence from genomic
DNA within a cloning vector. (See, e.g., Sarkar, G. (1993) PCR Methods Applic. 2:318-322.)
Another method, inverse PCR, uses primers that extend in divergent directions to amplify unknown
sequence from a circularized template. The template is derived from restriction fragments comprising
a known genomic locus and surrounding sequences. (See, e.g., Triglia, T. et al. (1988) Nucleic Acids
Res. 16:8186.) A third method, capture PCR, involves PCR amplification of DNA fragments
adjacent to known sequences in human and yeast artificial chromosome DNA. (See, e.g., Lagerstrom,
M. et al. (1991) PCR Methods Applic. 1:111-119.) In this method, multiple restriction enzyme
digestions and ligations may be used to insert an engineered double-stranded sequence into a region
of unknown sequence before performing PCR. Other methods which may be used to retrieve
unknown sequences are known in the art. (See, e.g., Parker, J.D. et al. (1991) Nucleic Acids Res.
19:3055-3060.) Additionally, one may use PCR, nested primers, and PROMOTERFINDER libraries
(Clontech, Palo Alto CA) to walk genomic DNA. This procedure avoids the need to screen libraries
and is useful in finding intron/exon junctions. For all PCR-based methods, primers may be designed
using commercially available software, such as OLIGO 4.06 primer analysis software (National
Biosciences, Plymouth MN) or another appropriate program, to be about 22 to 30 nucleotides in
length, to have a GC content of about 50% or more, and to anneal to the template at temperatures of
about 68°C to 72°C.
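
The length and GC-content criteria stated above can be checked with a short script. The following is a minimal sketch only: the function names and the example primer are hypothetical, the thresholds are simply the approximate figures given in the text, and the annealing temperature criterion is not modeled because it depends on the melting-temperature method a given design program uses.

```python
def gc_fraction(primer: str) -> float:
    """Return the fraction of G and C bases in a primer sequence."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def meets_design_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 0.50) -> bool:
    """Check the approximate length and GC-content criteria from the text."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


# Hypothetical 24-nucleotide primer, used only for illustration.
example_primer = "GCGTACCGGATCCGTTAGCCGATC"
print(len(example_primer), round(gc_fraction(example_primer), 3),
      meets_design_criteria(example_primer))
```

A dedicated primer analysis program would additionally evaluate annealing temperature, secondary structure, and primer-dimer formation, none of which this sketch attempts.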

When screening for full length cDNAs, it is preferable to use libraries that have been
size-selected to include larger cDNAs. In addition, random-primed libraries, which often include
sequences containing the 5' regions of genes, are preferable for situations in which an oligo d(T)
library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence
into 5' non-transcribed regulatory regions.

Capillary electrophoresis systems which are commercially available may be used to analyze
the size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary
sequencing may employ flowable polymers for electrophoretic separation, four different
nucleotide-specific, laser-stimulated fluorescent dyes, and a charge coupled device camera for
detection of the emitted wavelengths. Output/light intensity may be converted to electrical signal
using appropriate software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, Applied Biosystems),
and the entire process from loading of samples to computer analysis and electronic data display may
be computer controlled. Capillary electrophoresis is especially preferable for sequencing small DNA
fragments which may be present in limited amounts in a particular sample.

In another embodiment of the invention, polynucleotide sequences or fragments thereof
which encode PP may be cloned in recombinant DNA molecules that direct expression of PP, or
fragments or functional equivalents thereof, in appropriate host cells. Due to the inherent degeneracy
of the genetic code, other DNA sequences which encode substantially the same or a functionally
equivalent amino acid sequence may be produced and used to express PP.
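
The degeneracy referred to above can be illustrated with a small example. This is a sketch under stated assumptions: the two DNA sequences are hypothetical and unrelated to PP, and the codon table is deliberately partial, containing only the standard-code codons the example uses.

```python
# Partial standard codon table: only the codons used in this illustration.
CODON_TABLE = {
    "ATG": "M",
    "CTG": "L", "TTA": "L",
    "AGC": "S", "TCA": "S",
    "CGT": "R", "AGA": "R",
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence using the partial table above."""
    seq = dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Two hypothetical sequences that differ at the nucleotide level.
seq_a = "ATGCTGAGCCGT"
seq_b = "ATGTTATCAAGA"
print(translate(seq_a), translate(seq_b), seq_a != seq_b)
```

Both sequences translate to the same four-residue peptide even though they differ at several nucleotide positions, which is the property that allows alternative coding sequences to express the same polypeptide.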

The nucleotide sequences of the present invention can be engineered using methods generally
known in the art in order to alter PP-encoding sequences for a variety of purposes including, but not
limited to, modification of the cloning, processing, and/or expression of the gene product. DNA
shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic
oligonucleotides may be used to engineer the nucleotide sequences. For example,
oligonucleotide-mediated site-directed mutagenesis may be used to introduce mutations that create
new restriction sites, alter glycosylation patterns, change codon preference, produce splice variants,
and so forth.

The nucleotides of the present invention may be subjected to DNA shuffling techniques such
as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent Number
5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat.
Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or
improve the biological properties of PP, such as its biological or enzymatic activity or its ability to
bind to other molecules or compounds. DNA shuffling is a process by which a library of gene
variants is produced using PCR-mediated recombination of gene fragments. The library is then
subjected to selection or screening procedures that identify those gene variants with the desired
properties. These preferred variants may then be pooled and further subjected to recursive rounds of
DNA shuffling and selection/screening. Thus, genetic diversity is created through "artificial"
breeding and rapid molecular evolution. For example, fragments of a single gene containing random
point mutations may be recombined, screened, and then reshuffled until the desired properties are
optimized. Alternatively, fragments of a given gene may be recombined with fragments of
homologous genes in the same gene family, either from the same or different species, thereby
maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable
manner.
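
The recursive cycle of recombination followed by selection described above can be sketched as a toy in silico analogy. This is not the MOLECULARBREEDING procedure itself: the parent sequences, the fixed fragment length, and the scoring function `toy_score` are hypothetical stand-ins for gene fragments and a laboratory screen, and point mutagenesis is omitted.

```python
import random


def shuffle_once(parents, fragment_len, rng):
    """Assemble one chimera, drawing each fragment from a randomly chosen parent."""
    length = len(parents[0])
    return "".join(
        rng.choice(parents)[start:start + fragment_len]
        for start in range(0, length, fragment_len)
    )


def evolve(parents, score, rounds=3, library_size=50, keep=4,
           fragment_len=3, seed=0):
    """Run recursive rounds of shuffling, then keep the top-scoring variants."""
    rng = random.Random(seed)
    pool = sorted(parents)
    for _ in range(rounds):
        library = {shuffle_once(pool, fragment_len, rng)
                   for _ in range(library_size)}
        # Retain the current pool so the best score never decreases.
        candidates = library | set(pool)
        pool = sorted(candidates, key=lambda s: (-score(s), s))[:keep]
    return pool


def toy_score(seq):
    """Hypothetical screen: an arbitrary score standing in for an assay readout."""
    return seq.count("A")


# Hypothetical equal-length parent sequences.
parents = ["AAACCCAAACCC", "CCCAAACCCAAA"]
best = evolve(parents, toy_score)[0]
print(best, toy_score(best))
```

Because each round pools the best variants from the previous round with the new library, the top score can only stay the same or improve, which mirrors the pooling of preferred variants before further shuffling.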

In another embodiment, sequences encoding PP may be synthesized, in whole or in part,
using chemical methods well known in the art. (See, e.g., Caruthers, M.H. et al. (1980) Nucleic Acids
Symp. Ser. 7:215-223; and Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232.)
Alternatively, PP itself or a fragment thereof may be synthesized using chemical methods. For
example, peptide synthesis can be performed using various solution-phase or solid-phase techniques.
(See, e.g., Creighton, T. (1984) Proteins, Structures and Molecular Properties, WH Freeman, New
York NY, pp. 55-60; and Roberge, J.Y. et al. (1995) Science 269:202-204.) Automated synthesis
may be achieved using the ABI 431A peptide synthesizer (Applied Biosystems). Additionally, the
amino acid sequence of PP, or any part thereof, may be altered during direct synthesis and/or
combined with sequences from other proteins, or any part thereof, to produce a variant polypeptide or
a polypeptide having a sequence of a naturally occurring polypeptide.

The peptide may be substantially purified by preparative high performance liquid
chromatography. (See, e.g., Chiez, R.M. and F.Z. Regnier (1990) Methods Enzymol. 182:392-421.)
The composition of the synthetic peptides may be confirmed by amino acid analysis or by
sequencing. (See, e.g., Creighton, supra, pp. 28-53.)

In order to express a biologically active PP, the nucleotide sequences encoding PP or
derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which contains
the necessary elements for transcriptional and translational control of the inserted coding sequence in
a suitable host. These elements include regulatory sequences, such as enhancers, constitutive and
inducible promoters, and 5' and 3' untranslated regions in the vector and in polynucleotide sequences
encoding PP. Such elements may vary in their strength and specificity. Specific initiation signals
may also be used to achieve more efficient translation of sequences encoding PP. Such signals
include the ATG initiation codon and adjacent sequences, e.g. the Kozak sequence. In cases where
sequences encoding PP and its initiation codon and upstream regulatory sequences are inserted into
the appropriate expression vector, no additional transcriptional or translational control signals may be
needed. However, in cases where only coding sequence, or a fragment thereof, is inserted, exogenous
translational control signals including an in-frame ATG initiation codon should be provided by the
vector. Exogenous translational elements and initiation codons may be of various origins, both
natural and synthetic. The efficiency of expression may be enhanced by the inclusion of enhancers
appropriate for the particular host cell system used. (See, e.g., Scharf, D. et al. (1994) Results Probl.
Cell Differ. 20:125-162.)
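
The requirement that a vector-supplied ATG be in frame with the inserted coding sequence can be expressed as a simple arithmetic check. The sketch below is illustrative only: the function names, the example inserts, and the notion of a `linker_len` (the number of bases between the vector ATG and the first codon of the insert) are assumptions, and the insert is assumed to begin at a codon boundary. Kozak context is not modeled.

```python
def starts_with_atg(insert: str) -> bool:
    """True if the inserted sequence begins with an ATG initiation codon."""
    return insert.upper().startswith("ATG")


def vector_atg_in_frame(linker_len: int) -> bool:
    """True if linker_len bases after the vector ATG keep the insert in frame."""
    if linker_len < 0:
        raise ValueError("linker_len must be non-negative")
    return linker_len % 3 == 0


def initiation_plan(insert: str, linker_len: int) -> str:
    """Summarize where the initiation codon comes from for a given construct."""
    if starts_with_atg(insert):
        return "insert carries its own ATG"
    if vector_atg_in_frame(linker_len):
        return "vector ATG is in frame"
    return "vector ATG is out of frame"


# Hypothetical inserts and linker lengths.
print(initiation_plan("ATGGCTGGA", 5))
print(initiation_plan("GCTGGATCC", 6))
print(initiation_plan("GCTGGATCC", 4))
```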

Methods which are well known to those skilled in the art may be used to construct expression
vectors containing sequences encoding PP and appropriate transcriptional and translational control
elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in
vivo genetic recombination. (See, e.g., Sambrook, J. et al. (1989) Molecular Cloning, A Laboratory
Manual, Cold Spring Harbor Press, Plainview NY, ch. 4, 8, and 16-17; Ausubel, F.M. et al. (1995)
Current Protocols in Molecular Biology, John Wiley & Sons, New York NY, ch. 9, 13, and 16.)

A variety of expression vector/host systems may be utilized to contain and express sequences
encoding PP. These include, but are not limited to, microorganisms such as bacteria transformed with
recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with
yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus);
plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV,
or tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or
animal cell systems. (See, e.g., Sambrook, supra; Ausubel, supra; Van Heeke, G. and S.M. Schuster
(1989) J. Biol. Chem. 264:5503-5509; Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. USA
91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945; Takamatsu, N. (1987) EMBO
J. 6:307-311; The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New
York NY, pp. 191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; and
Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355.) Expression vectors derived from retroviruses,
adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for
delivery of nucleotide sequences to the targeted organ, tissue, or cell population. (See, e.g., Di
Nicola, M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci.
USA 90(13):6340-6344; Buller, R.M. et al. (1985) Nature 317(6040):813-815; McGregor, D.P. et al.
(1994) Mol. Immunol. 31(3):219-226; and Verma, I.M. and N. Somia (1997) Nature 389:239-242.)
The invention is not limited by the host cell employed.

In bacterial systems, a number of cloning and expression vectors may be selected depending
upon the use intended for polynucleotide sequences encoding PP. For example, routine cloning,
subcloning, and propagation of polynucleotide sequences encoding PP can be achieved using a
multifunctional E. coli vector such as PBLUESCRIPT (Stratagene, La Jolla CA) or PSPORT1
plasmid (Life Technologies). Ligation of sequences encoding PP into the vector's multiple cloning
site disrupts the lacZ gene, allowing a colorimetric screening procedure for identification of
transformed bacteria containing recombinant molecules. In addition, these vectors may be useful for
in vitro transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of
nested deletions in the cloned sequence. (See, e.g., Van Heeke, G. and S.M. Schuster (1989) J. Biol.
Chem. 264:5503-5509.) When large quantities of PP are needed, e.g. for the production of
antibodies, vectors which direct high level expression of PP may be used. For example, vectors
containing the strong, inducible SP6 or T7 bacteriophage promoter may be used.

Yeast expression systems may be used for production of PP. A number of vectors containing
constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH promoters, may
be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such vectors direct
either the secretion or intracellular retention of expressed proteins and enable integration of foreign
sequences into the host genome for stable propagation. (See, e.g., Ausubel, 1995, supra; Bitter, G.A.
et al. (1987) Methods Enzymol. 153:516-544; and Scorer, C.A. et al. (1994) Bio/Technology
12:181-184.)

Plant systems may also be used for expression of PP. Transcription of sequences encoding
PP may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used alone or in
combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J.
6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock
promoters may be used. (See, e.g., Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al.
(1984) Science 224:838-843; and Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105.)
These constructs can be introduced into plant cells by direct DNA transformation or
pathogen-mediated transfection. (See, e.g., The McGraw Hill Yearbook of Science and Technology
(1992) McGraw Hill, New York NY, pp. 191-196.)

In mammalian cells, a number of viral-based expression systems may be utilized. In cases
where an adenovirus is used as an expression vector, sequences encoding PP may be ligated into an
adenovirus transcription/translation complex consisting of the late promoter and tripartite leader
sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to obtain
infective virus which expresses PP in host cells. (See, e.g., Logan, J. and T. Shenk (1984) Proc. Natl.
Acad. Sci. USA 81:3655-3659.) In addition, transcription enhancers, such as the Rous sarcoma virus
(RSV) enhancer, may be used to increase expression in mammalian host cells. SV40 or EBV-based
vectors may also be used for high-level protein expression.

Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of
DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are
constructed and delivered via conventional delivery methods (liposomes, polycationic amino
polymers, or vesicles) for therapeutic purposes. (See, e.g., Harrington, J.J. et al. (1997) Nat. Genet.
15:345-355.)

For long term production of recombinant proteins in mammalian systems, stable expression
of PP in cell lines is preferred. For example, sequences encoding PP can be transformed into cell
lines using expression vectors which may contain viral origins of replication and/or endogenous
expression elements and a selectable marker gene on the same or on a separate vector. Following the
introduction of the vector, cells may be allowed to grow for about 1 to 2 days in enriched media
before being switched to selective media. The purpose of the selectable marker is to confer resistance
to a selective agent, and its presence allows growth and recovery of cells which successfully express
the introduced sequences. Resistant clones of stably transformed cells may be propagated using
tissue culture techniques appropriate to the cell type.

Any number of selection systems may be used to recover transformed cell lines. These
include, but are not limited to, the herpes simplex virus thymidine kinase and adenine
phosphoribosyltransferase genes, for use in tk- and apr- cells, respectively. (See, e.g., Wigler, M. et
al. (1977) Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823.) Also, antimetabolite, antibiotic,
or herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to
methotrexate; neo confers resistance to the aminoglycosides neomycin and G-418; and als and pat
confer resistance to chlorsulfuron and phosphinotricin acetyltransferase, respectively. (See, e.g.,
Wigler, M. et al. (1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981)
J. Mol. Biol. 150:1-14.) Additional selectable genes have been described, e.g., trpB and hisD, which
alter cellular requirements for metabolites. (See, e.g., Hartman, S.C. and R.C. Mulligan (1988) Proc.
Natl. Acad. Sci. USA 85:8047-8051.) Visible markers, e.g., anthocyanins, green fluorescent proteins
(GFP; Clontech), β glucuronidase and its substrate β-glucuronide, or luciferase and its substrate
luciferin may be used. These markers can be used not only to identify transformants, but also to
quantify the amount of transient or stable protein expression attributable to a specific vector system.
(See, e.g., Rhodes, C.A. (1995) Methods Mol. Biol. 55:121-131.)

Although the presence/absence of marker gene expression suggests that the gene of interest is
also present, the presence and expression of the gene may need to be confirmed. For example, if the
sequence encoding PP is inserted within a marker gene sequence, transformed cells containing
sequences encoding PP can be identified by the absence of marker gene function. Alternatively, a
marker gene can be placed in tandem with a sequence encoding PP under the control of a single
promoter. Expression of the marker gene in response to induction or selection usually indicates
expression of the tandem gene as well.

In general, host cells that contain the nucleic acid sequence encoding PP and that express PP
may be identified by a variety of procedures known to those of skill in the art. These procedures
include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and
protein bioassay or immunoassay techniques which include membrane, solution, or chip based
technologies for the detection and/or quantification of nucleic acid or protein sequences.

Immunological methods for detecting and measuring the expression of PP using either
specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques
include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and
fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing
monoclonal antibodies reactive to two non-interfering epitopes on PP is preferred, but a competitive
binding assay may be employed. These and other assays are well known in the art. (See, e.g.,
Hampton, R. et al. (1990) Serological Methods, a Laboratory Manual, APS Press, St Paul MN, Sect.
IV; Coligan, J.E. et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and
Wiley-Interscience, New York NY; and Pound, J.D. (1998) Immunochemical Protocols, Humana Press,
Totowa NJ.)

A wide variety of labels and conjugation techniques are known by those skilled in the art and
may be used in various nucleic acid and amino acid assays. Means for producing labeled
hybridization or PCR probes for detecting sequences related to polynucleotides encoding PP include
oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide.
Alternatively, the sequences encoding PP, or any fragments thereof, may be cloned into a vector for
the production of an mRNA probe. Such vectors are known in the art, are commercially available,
and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase
such as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety
of commercially available kits, such as those provided by Amersham Pharmacia Biotech, Promega
(Madison WI), and US Biochemical. Suitable reporter molecules or labels which may be used for
ease of detection include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic
agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the like.

Host cells transformed with nucleotide sequences encoding PP may be cultured under
conditions suitable for the expression and recovery of the protein from cell culture. The protein
produced by a transformed cell may be secreted or retained intracellularly depending on the sequence
and/or the vector used. As will be understood by those of skill in the art, expression vectors
containing polynucleotides which encode PP may be designed to contain signal sequences which
direct secretion of PP through a prokaryotic or eukaryotic cell membrane.

In addition, a host cell strain may be chosen for its ability to modulate expression of the
inserted sequences or to process the expressed protein in the desired fashion. Such modifications of
the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation,
phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a "prepro" or
"pro" form of the protein may also be used to specify protein targeting, folding, and/or activity.
Different host cells which have specific cellular machinery and characteristic mechanisms for
post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the
American Type Culture Collection (ATCC, Manassas VA) and may be chosen to ensure the correct
modification and processing of the foreign protein.

In another embodiment of the invention, natural, modified, or recombinant nucleic acid
sequences encoding PP may be ligated to a heterologous sequence resulting in translation of a fusion
protein in any of the aforementioned host systems. For example, a chimeric PP protein containing a
heterologous moiety that can be recognized by a commercially available antibody may facilitate the
screening of peptide libraries for inhibitors of PP activity. Heterologous protein and peptide moieties
may also facilitate purification of fusion proteins using commercially available affinity matrices.
Such moieties include, but are not limited to, glutathione S-transferase (GST), maltose binding
protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc, and
hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their cognate fusion
proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and metal-chelate
resins, respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity purification of
fusion proteins using commercially available monoclonal and polyclonal antibodies that specifically
recognize these epitope tags. A fusion protein may also be engineered to contain a proteolytic
cleavage site located between the PP encoding sequence and the heterologous protein sequence, so
that PP may be cleaved away from the heterologous moiety following purification. Methods for
fusion protein expression and purification are discussed in Ausubel (1995, supra, ch. 10). A variety
of commercially available kits may also be used to facilitate expression and purification of fusion
proteins.
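
The "respectively" pairing of fusion moieties with their purification matrices in the paragraph above can be laid out as a lookup table. The following is a sketch for orientation only; the dictionary and function names are hypothetical, and the pairings are copied from the text rather than from any vendor protocol.

```python
# Tag-to-matrix pairings exactly as listed in the text.
AFFINITY_MATRIX = {
    "GST": "immobilized glutathione",
    "MBP": "maltose",
    "Trx": "phenylarsine oxide",
    "CBP": "calmodulin",
    "6-His": "metal-chelate",
}

# Epitope tags purified with tag-specific antibodies.
EPITOPE_TAGS = {"FLAG", "c-myc", "HA"}


def purification_route(tag: str) -> str:
    """Return the purification approach the text associates with a fusion tag."""
    if tag in AFFINITY_MATRIX:
        return f"affinity purification on {AFFINITY_MATRIX[tag]} resin"
    if tag in EPITOPE_TAGS:
        return "immunoaffinity purification with a tag-specific antibody"
    raise KeyError(f"tag not listed in the text: {tag}")


for tag in ("GST", "6-His", "FLAG"):
    print(tag, "->", purification_route(tag))
```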

In a further embodiment of the invention, synthesis of radiolabeled PP may be achieved in
vitro using the TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These
systems couple transcription and translation of protein-coding sequences operably associated with the
T7, T3, or SP6 promoters. Translation takes place in the presence of a radiolabeled amino acid
precursor, for example, 35S-methionine.

PP of the present invention or fragments thereof may be used to screen for compounds that
specifically bind to PP. At least one and up to a plurality of test compounds may be screened for
specific binding to PP. Examples of test compounds include antibodies, oligonucleotides, proteins
(e.g., receptors), or small molecules.

In one embodiment, the compound thus identified is closely related to the natural ligand of
PP, e.g., a ligand or fragment thereof, a natural substrate, a structural or functional mimetic, or a
natural binding partner. (See, e.g., Coligan, J.E. et al. (1991) Current Protocols in Immunology 1(2):
Chapter 5.) Similarly, the compound can be closely related to the natural receptor to which PP binds,
or to at least a fragment of the receptor, e.g., the ligand binding site. In either case, the compound
can be rationally designed using known techniques. In one embodiment, screening for these
compounds involves producing appropriate cells which express PP, either as a secreted protein or on
the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli. Cells
expressing PP or cell membrane fractions which contain PP are then contacted with a test compound
and binding, stimulation, or inhibition of activity of either PP or the compound is analyzed.

An assay may simply test binding of a test compound to the polypeptide, wherein binding is
detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example,
the assay may comprise the steps of combining at least one test compound with PP, either in solution
or affixed to a solid support, and detecting the binding of PP to the compound. Alternatively, the
assay may detect or measure binding of a test compound in the presence of a labeled competitor.
Additionally, the assay may be carried out using cell-free preparations, chemical libraries, or natural
product mixtures, and the test compound(s) may be free in solution or affixed to a solid support.

PP of the present invention or fragments thereof may be used to screen for compounds that
modulate the activity of PP. Such compounds may include agonists, antagonists, or partial or inverse
agonists. In one embodiment, an assay is performed under conditions permissive for PP activity,
wherein PP is combined with at least one test compound, and the activity of PP in the presence of a
test compound is compared with the activity of PP in the absence of the test compound. A change in
the activity of PP in the presence of the test compound is indicative of a compound that modulates the
activity of PP. Alternatively, a test compound is combined with an in vitro or cell-free system
comprising PP under conditions suitable for PP activity, and the assay is performed. In either of these
assays, a test compound which modulates the activity of PP may do so indirectly and need not come
in direct contact with the test compound. At least one and up to a plurality of test compounds may be
screened.
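
The comparison of PP activity in the presence and absence of a test compound can be reduced to a simple calculation. The sketch below is an illustration under stated assumptions: the function names, the activity values, and the 20% cutoff (`threshold_pct`) are hypothetical and are not specified anywhere in the text; in practice a cutoff would be set from assay variability and replicate measurements.

```python
def percent_change(activity_with: float, activity_without: float) -> float:
    """Percent change in activity relative to the no-compound control."""
    if activity_without <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * (activity_with - activity_without) / activity_without


def classify(activity_with: float, activity_without: float,
             threshold_pct: float = 20.0) -> str:
    """Label a test compound using a hypothetical percent-change cutoff."""
    change = percent_change(activity_with, activity_without)
    if change >= threshold_pct:
        return "increases activity"
    if change <= -threshold_pct:
        return "decreases activity"
    return "no clear change"


# Hypothetical activity readings in arbitrary units.
print(percent_change(50.0, 100.0), classify(50.0, 100.0))
print(percent_change(150.0, 100.0), classify(150.0, 100.0))
```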

In another embodiment, polynucleotides encoding PP or their mammalian homologs may be
"knocked out" in an animal model system using homologous recombination in embryonic stem (ES)
cells. Such techniques are well known in the art and are useful for the generation of animal models of
human disease. (See, e.g., U.S. Patent Number 5,175,383 and U.S. Patent Number 5,767,337.) For
example, mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse
embryo and grown in culture. The ES cells are transformed with a vector containing the gene of
interest disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R.
(1989) Science 244:1288-1292). The vector integrates into the corresponding region of the host
genome by homologous recombination. Alternatively, homologous recombination takes place using
the Cre-loxP system to knockout a gene of interest in a tissue- or developmental stage-specific
manner (Marth, J.D. (1996) Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids
Res. 25:4323-4330). Transformed ES cells are identified and microinjected into mouse cell
blastocysts such as those from the C57BL/6 mouse strain. The blastocysts are surgically transferred
to pseudopregnant dams, and the resulting chimeric progeny are genotyped and bred to produce
heterozygous or homozygous strains. Transgenic animals thus generated may be tested with potential
therapeutic or toxic agents.

Polynucleotides encoding PP may also be manipulated in vitro in ES cells derived from
human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell
lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate
into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al.
(1998) Science 282:1145-1147).

Polynucleotides encoding PP can also be used to create "knockin" humanized animals (pigs)
or transgenic animals (mice or rats) to model human disease. With knockin technology, a region of a
polynucleotide encoding PP is injected into animal ES cells, and the injected sequence integrates into
the animal cell genome. Transformed cells are injected into blastulae, and the blastulae are implanted
as described above. Transgenic progeny or inbred lines are studied and treated with potential
pharmaceutical agents to obtain information on treatment of a human disease. Alternatively, a
mammal inbred to overexpress PP, e.g., by secreting PP in its milk, may also serve as a convenient
source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).

THERAPEUTICS 

Chemical and structural similarity, e.g., in the context of sequences and motifs, exists
between regions of PP and protein phosphatases. In addition, the expression of PP is closely
associated with thalamus, pancreas, testis, brain, vascular, and fetal lung tissues, as well as colon
tissue pseudopolyps associated with multiple tubulovillous adenomas. Therefore, PP appears to play a
role in immune system disorders, neurological disorders, developmental disorders, and cell
proliferative disorders. In the treatment of disorders associated with increased PP expression or
activity, it is desirable to decrease the expression or activity of PP. In the treatment of disorders
associated with decreased PP expression or activity, it is desirable to increase the expression or
activity of PP.

Therefore, in one embodiment, PP or a fragment or derivative tiiereof may be administered to 
a subject to treat or prevent a disorder associated with decreased expression or activity of PP. 
Examples of such disorders include, but are not limited to, an immune system disorder, such as 
acquired immunodeficiency syndrome (AIDS), X-linked agammaglobinemia of Bniton. common 

20 variable immunodeficiency (CVI), DiGeorge's syndrome (tiiymic hypoplasia), thymic dysplasia, 

isolated IgA deficiency, severe combined immunodeficiency disease (SOD), immunodeficiency with 
thrombocytopenia and eczenoa (Wiskott-Aldrich syndrome), Chediak-Higashi syndrome, chronic 
granulomatous diseases, hereditary angioneurotic edema, immunodeficiency associated with 
Cushing's disease, Addison's disease, adult respiratory distress syndrome, allergies, ankylosing 

25 spondylitis, amyloidosis, anemia, astimia, atherosclerosis, autoimmune hemolytic anemia, 
autoinmiune thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodennal dystrophy 
(APECED), bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, 
dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins, 
erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's 

30 syndrome, gout, Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel 
syndrome, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, 
osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid 
arthritis, scleroderma, Sj5gren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, 
systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, 

35 con^lications of cancer, hemodialysis, and extracorporeal circulation, viral, bacterial, fungal, 
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parasitic, protozoal, and helminthic infections, and trauma; a neurological disorder, such as epilepsy, 
ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease,
Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic
lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis
pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and
viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial
thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases
including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal
familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis,
tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental
retardation and other developmental disorders of the central nervous system including Down
syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve
disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral
nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and
toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety,
and schizophrenic disorders, seasonal affective disorder (SAD), akathisia, amnesia, catatonia,
diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia,
Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, and familial
frontotemporal dementia; a developmental disorder, such as renal tubular acidosis, anemia, Cushing's
syndrome, achondroplastic dwarfism, Duchenne and Becker muscular dystrophy, epilepsy, gonadal
dysgenesis, WAGR syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental
retardation), Smith-Magenis syndrome, myelodysplastic syndrome, hereditary mucoepithelial
dysplasia, hereditary keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and
neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea
and cerebral palsy, spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, and
sensorineural hearing loss; and a cell proliferative disorder, such as actinic keratosis, arteriosclerosis,
atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis,
paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and
cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma,
teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain,
breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary,
pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus.

In another embodiment, a vector capable of expressing PP or a fragment or derivative thereof
may be administered to a subject to treat or prevent a disorder associated with decreased expression
or activity of PP including, but not limited to, those described above.

In a further embodiment, a composition comprising a substantially purified PP in conjunction
with a suitable pharmaceutical carrier may be administered to a subject to treat or prevent a disorder
associated with decreased expression or activity of PP including, but not limited to, those provided
above.

In still another embodiment, an agonist which modulates the activity of PP may be
administered to a subject to treat or prevent a disorder associated with decreased expression or
activity of PP including, but not limited to, those listed above.

In a further embodiment, an antagonist of PP may be administered to a subject to treat or
prevent a disorder associated with increased expression or activity of PP. Examples of such disorders
include, but are not limited to, those immune system disorders, neurological disorders, developmental
disorders, and cell proliferative disorders described above. In one aspect, an antibody which
specifically binds PP may be used directly as an antagonist or indirectly as a targeting or delivery
mechanism for bringing a pharmaceutical agent to cells or tissues which express PP.

In an additional embodiment, a vector expressing the complement of the polynucleotide
encoding PP may be administered to a subject to treat or prevent a disorder associated with increased
expression or activity of PP including, but not limited to, those described above.

In other embodiments, any of the proteins, antagonists, antibodies, agonists, complementary
sequences, or vectors of the invention may be administered in combination with other appropriate
therapeutic agents. Selection of the appropriate agents for use in combination therapy may be made
by one of ordinary skill in the art, according to conventional pharmaceutical principles. The
combination of therapeutic agents may act synergistically to effect the treatment or prevention of the
various disorders described above. Using this approach, one may be able to achieve therapeutic
efficacy with lower dosages of each agent, thus reducing the potential for adverse side effects.

An antagonist of PP may be produced using methods which are generally known in the art. In
particular, purified PP may be used to produce antibodies or to screen libraries of pharmaceutical
agents to identify those which specifically bind PP. Antibodies to PP may also be generated using
methods that are well known in the art. Such antibodies may include, but are not limited to,
polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and fragments
produced by a Fab expression library. Neutralizing antibodies (i.e., those which inhibit dimer
formation) are generally preferred for therapeutic use.

For the production of antibodies, various hosts including goats, rabbits, rats, mice, humans,
and others may be immunized by injection with PP or with any fragment or oligopeptide thereof
which has immunogenic properties. Depending on the host species, various adjuvants may be used to
increase immunological response. Such adjuvants include, but are not limited to, Freund's, mineral
gels such as aluminum hydroxide, and surface active substances such as lysolecithin, pluronic
polyols, polyanions, peptides, oil emulsions, KLH, and dinitrophenol. Among adjuvants used in
humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are especially preferable.

It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to PP
have an amino acid sequence consisting of at least about 5 amino acids, and generally will consist of
at least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or fragments are
identical to a portion of the amino acid sequence of the natural protein. Short stretches of PP amino
acids may be fused with those of another protein, such as KLH, and antibodies to the chimeric
molecule may be produced.

Monoclonal antibodies to PP may be prepared using any technique which provides for the
production of antibody molecules by continuous cell lines in culture. These include, but are not
limited to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma
technique. (See, e.g., Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J.
Immunol. Methods 81:31-42; Cote, R.J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; and
Cole, S.P. et al. (1984) Mol. Cell Biol. 62:109-120.)

In addition, techniques developed for the production of "chimeric antibodies," such as the
splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate
antigen specificity and biological activity, can be used. (See, e.g., Morrison, S.L. et al. (1984) Proc.
Natl. Acad. Sci. USA 81:6851-6855; Neuberger, M.S. et al. (1984) Nature 312:604-608; and Takeda,
S. et al. (1985) Nature 314:452-454.) Alternatively, techniques described for the production of single
chain antibodies may be adapted, using methods known in the art, to produce PP-specific single chain
antibodies. Antibodies with related specificity, but of distinct idiotypic composition, may be
generated by chain shuffling from random combinatorial immunoglobulin libraries. (See, e.g.,
Burton, D.R. (1991) Proc. Natl. Acad. Sci. USA 88:10134-10137.)

Antibodies may also be produced by inducing in vivo production in the lymphocyte
population or by screening immunoglobulin libraries or panels of highly specific binding reagents as
disclosed in the literature. (See, e.g., Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA
86:3833-3837; Winter, G. et al. (1991) Nature 349:293-299.)

Antibody fragments which contain specific binding sites for PP may also be generated. For
example, such fragments include, but are not limited to, F(ab')2 fragments produced by pepsin
digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of
the F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and
easy identification of monoclonal Fab fragments with the desired specificity. (See, e.g., Huse, W.D.
et al. (1989) Science 246:1275-1281.)

Various immunoassays may be used for screening to identify antibodies having the desired
specificity. Numerous protocols for competitive binding or immunoradiometric assays using either
polyclonal or monoclonal antibodies with established specificities are well known in the art. Such
immunoassays typically involve the measurement of complex formation between PP and its specific
antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to
two non-interfering PP epitopes is generally used, but a competitive binding assay may also be
employed (Pound, supra).

Various methods such as Scatchard analysis in conjunction with radioimmunoassay
techniques may be used to assess the affinity of antibodies for PP. Affinity is expressed as an
association constant, Ka, which is defined as the molar concentration of PP-antibody complex divided
by the molar concentrations of free antigen and free antibody under equilibrium conditions. The Ka
determined for a preparation of polyclonal antibodies, which are heterogeneous in their affinities for
multiple PP epitopes, represents the average affinity, or avidity, of the antibodies for PP. The Ka
determined for a preparation of monoclonal antibodies, which are monospecific for a particular PP
epitope, represents a true measure of affinity. High-affinity antibody preparations with Ka ranging
from about 10^ to 10^^ L/mole are preferred for use in immunoassays in which the PP-antibody
complex must withstand rigorous manipulations. Low-affinity antibody preparations with Ka ranging
from about 10^ to 10^ L/mole are preferred for use in immunopurification and similar procedures
which ultimately require dissociation of PP, preferably in active form, from the antibody (Catty, D.
(1988) Antibodies, Volume I: A Practical Approach, IRL Press, Washington DC; Liddell, J.E. and A.
Cryer (1991) A Practical Guide to Monoclonal Antibodies, John Wiley & Sons, New York NY).
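
The definition of Ka given above can be illustrated with a short calculation. The following Python sketch is illustrative only and is not part of the disclosed methods: the function names and the concentrations in the usage note are hypothetical, and the Scatchard fit assumes a single class of binding sites (as for a monoclonal preparation), in which case a plot of bound/free antigen against bound antigen is a straight line with slope -Ka. For a polyclonal preparation the plot is generally curved, and a single fitted slope would at best approximate an average value.

```python
def association_constant(complex_molar, free_antigen_molar, free_antibody_molar):
    """Ka (L/mole) = [PP-antibody complex] / ([free PP] * [free antibody]).

    All concentrations are molar and are assumed to be measured at equilibrium.
    """
    if free_antigen_molar <= 0 or free_antibody_molar <= 0:
        raise ValueError("free antigen and free antibody concentrations must be positive")
    return complex_molar / (free_antigen_molar * free_antibody_molar)


def scatchard_ka(bound_molar, free_molar):
    """Estimate Ka from paired bound/free antigen concentrations.

    Assumes one class of binding sites, so that
        bound / free = Ka * (Bmax - bound)
    and an ordinary least-squares line through (bound, bound/free)
    has slope -Ka.
    """
    if len(bound_molar) != len(free_molar) or len(bound_molar) < 2:
        raise ValueError("need at least two paired bound/free measurements")
    x = list(bound_molar)
    y = [b / f for b, f in zip(bound_molar, free_molar)]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("bound concentrations must not all be identical")
    return -sxy / sxx
```

As a usage example with hypothetical values, `association_constant(1e-9, 1e-9, 1e-9)` evaluates to approximately 1e9 L/mole, since 1 nM of complex is divided by the product of 1 nM free antigen and 1 nM free antibody.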

The titer and avidity of polyclonal antibody preparations may be further evaluated to
determine the quality and suitability of such preparations for certain downstream applications. For
example, a polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml,
preferably 5-10 mg specific antibody/ml, is generally employed in procedures requiring precipitation
of PP-antibody complexes. Procedures for evaluating antibody specificity, titer, and avidity, and
guidelines for antibody quality and usage in various applications, are generally available. (See, e.g.,
Catty, supra, and Coligan et al., supra.)

In another embodiment of the invention, the polynucleotides encoding PP, or any fragment or
complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene
expression can be achieved by designing complementary sequences or antisense molecules (DNA,
RNA, PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding
PP. Such technology is well known in the art, and antisense oligonucleotides or larger fragments can
be designed from various locations along the coding or control regions of sequences encoding PP.
(See, e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa NJ.)
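
By way of illustration only, the design of a complementary sequence from a chosen location along a target sequence can be sketched in Python as follows. The function name, the example sequence, and the chosen region are hypothetical and are not taken from any sequence encoding PP; the sketch simply returns the reverse complement of the selected region as a DNA oligonucleotide written 5' to 3', and it does not evaluate target accessibility, specificity, or chemical modification.

```python
# Complement of each target base, expressed as a DNA base.
# Both RNA (U) and DNA (T) targets are accepted.
_DNA_COMPLEMENT = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_oligo(target, start, length):
    """Return a DNA antisense oligonucleotide (5'->3') complementary to
    target[start:start + length], where `start` is a 0-based offset.
    """
    target = target.upper()
    if start < 0 or length <= 0 or start + length > len(target):
        raise ValueError("region lies outside the target sequence")
    region = target[start:start + length]
    try:
        # Reverse the region, then complement each base.
        return "".join(_DNA_COMPLEMENT[base] for base in reversed(region))
    except KeyError as exc:
        raise ValueError(f"unexpected base {exc.args[0]!r} in target") from None
```

For the hypothetical target `"AUGGCUAGC"`, `antisense_oligo("AUGGCUAGC", 0, 6)` returns `"AGCCAT"`, the reverse complement of the first six bases.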
In therapeutic use, any gene delivery system suitable for introduction of the antisense
sequences into appropriate target cells can be used. Antisense sequences can be delivered
intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence
complementary to at least a portion of the cellular sequence encoding the target protein. (See, e.g.,
Slater, J.E. et al. (1998) J. Allergy Clin. Immunol. 102(3):469-475; and Scanlon, K.J. et al. (1995)
9(13):1288-1296.) Antisense sequences can also be introduced intracellularly through the use of viral
vectors, such as retrovirus and adeno-associated virus vectors. (See, e.g., Miller, A.D. (1990) Blood
76:271; Ausubel, supra; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63(3):323-347.) Other
gene delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other
systems known in the art. (See, e.g., Rossi, J.J. (1995) Br. Med. Bull. 51(1):217-225; Boado, R.J. et
al. (1998) J. Pharm. Sci. 87(11):1308-1315; and Morris, M.C. et al. (1997) Nucleic Acids Res.
25(14):2730-2736.)

In another embodiment of the invention, polynucleotides encoding PP may be used for
somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency
(e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by
X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined
immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency
(Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475),
cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R.G. et al. (1995) Hum. Gene
Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial
hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal,
R.G. (1995) Science 270:404-410; Verma, I.M. and N. Somia (1997) Nature 389:239-242)), (ii)
express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated
cell proliferation), or (iii) express a protein which affords protection against intracellular parasites
(e.g., against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D.
(1988) Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399),
hepatitis B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides
brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the
case where a genetic deficiency in PP expression or regulation causes disease, the expression of PP
from an appropriate population of transduced cells may alleviate the clinical manifestations caused by
the genetic deficiency.

In a further embodiment of the invention, diseases or disorders caused by deficiencies in PP
are treated by constructing mammalian expression vectors encoding PP and introducing these vectors
by mechanical means into PP-deficient cells. Mechanical transfer technologies for use with cells in
vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii) ballistic gold particle
delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and (v) the use of
DNA transposons (Morgan, R.A. and W.F. Anderson (1993) Annu. Rev. Biochem. 62:191-217; Ivics,
Z. (1997) Cell 91:501-510; Boulay, J-L. and H. Récipon (1998) Curr. Opin. Biotechnol. 9:445-450).

WO 02/10363    PCT/US01/23716

Expression vectors that may be effective for the expression of PP include, but are not limited
to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX, PCR2-TOPOTA vectors (Invitrogen,
Carlsbad CA), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA), and
PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA). PP may be
expressed using (i) a constitutively active promoter, (e.g., from cytomegalovirus (CMV), Rous
sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible
promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl.
Acad. Sci. USA 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and
H.M. Blau (1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid
(Invitrogen)); the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND;
Invitrogen); the FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter
(Rossi, F.M.V. and Blau, H.M., supra); or (iii) a tissue-specific promoter or the native promoter of the
endogenous gene encoding PP from a normal individual.

Commercially available liposome transformation kits (e.g., the PERFECT LIPID
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver
polynucleotides to target cells in culture and require minimal effort to optimize experimental
parameters. In the alternative, transformation is performed using the calcium phosphate method
(Graham, F.L. and A.J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al.
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of
these standardized mammalian transfection protocols.

In another embodiment of the invention, diseases or disorders caused by genetic defects with
respect to PP expression are treated by constructing a retrovirus vector consisting of (i) the
polynucleotide encoding PP under the control of an independent promoter or the retrovirus long
terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive
element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences
required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are
commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc.
Natl. Acad. Sci. USA 92:6733-6737), incorporated by reference herein. The vector is propagated in
an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for
receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al.
(1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and
A.D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R.
et al. (1998) J. Virol. 72:9873-9880). U.S. Patent Number 5,910,434 to Rigg ("Method for obtaining
retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant")
discloses a method for obtaining retrovirus packaging cell lines and is hereby incorporated by
reference. Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4+
T-cells), and the return of transduced cells to a patient are procedures well known to persons skilled in
the art of gene therapy and have been well documented (Ranga, U. et al. (1997) J. Virol.
71:7020-7029; Bauer, G. et al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716;
Ranga, U. et al. (1998) Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997) Blood
89:2283-2290).

In the alternative, an adenovirus-based gene therapy delivery system is used to deliver
polynucleotides encoding PP to cells which have one or more genetic abnormalities with respect to
the expression of PP. The construction and packaging of adenovirus-based vectors are well known to
those with ordinary skill in the art. Replication defective adenovirus vectors have proven to be
versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas
(Csete, M.E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are
described in U.S. Patent Number 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"),
hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999)
Annu. Rev. Nutr. 19:511-544 and Verma, I.M. and N. Somia (1997) Nature 18:389:239-242, both
incorporated by reference herein.

In another alternative, a herpes-based, gene therapy delivery system is used to deliver
polynucleotides encoding PP to target cells which have one or more genetic abnormalities with
respect to the expression of PP. The use of herpes simplex virus (HSV)-based vectors may be
especially valuable for introducing PP to cells of the central nervous system, for which HSV has a
tropism. The construction and packaging of herpes-based vectors are well known to those with
ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has
been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res.
169:385-395). The construction of a HSV-1 virus vector has also been disclosed in detail in U.S.
Patent Number 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is
hereby incorporated by reference. U.S. Patent Number 5,804,413 teaches the use of recombinant
HSV d92 which consists of a genome containing at least one exogenous gene to be transferred to a
cell under the control of the appropriate promoter for purposes including human gene therapy. Also
taught by this patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27
and ICP22. For HSV vectors, see also Goins, W.F. et al. (1999) J. Virol. 73:519-532 and Xu, H. et al.
(1994) Dev. Biol. 163:152-161, hereby incorporated by reference. The manipulation of cloned
herpesvirus sequences, the generation of recombinant virus following the transfection of multiple
plasmids containing different segments of the large herpesvirus genomes, the growth and propagation
of herpesvirus, and the infection of cells with herpesvirus are techniques well known to those of
ordinary skill in the art.

In another alternative, an alphavirus (positive, single-stranded RNA virus) vector is used to
deliver polynucleotides encoding PP to target cells. The biology of the prototypic alphavirus, Semliki
Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based on the
SFV genome (Garoff, H. and K.J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During alphavirus
RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid proteins.
This subgenomic RNA replicates to higher levels than the full length genomic RNA, resulting in the
overproduction of capsid proteins relative to the viral proteins with enzymatic activity (e.g., protease
and polymerase). Similarly, inserting the coding sequence for PP into the alphavirus genome in place
of the capsid-coding region results in the production of a large number of PP-coding RNAs and the
synthesis of high levels of PP in vector transduced cells. While alphavirus infection is typically
associated with cell lysis within a few days, the ability to establish a persistent infection in hamster
normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) indicates that the lytic replication
of alphaviruses can be altered to suit the needs of the gene therapy application (Dryga, S.A. et al.
(1997) Virology 228:74-83). The wide host range of alphaviruses will allow the introduction of PP
into a variety of cell types. The specific transduction of a subset of cells in a population may require
the sorting of cells prior to transduction. The methods of manipulating infectious cDNA clones of
alphaviruses, performing alphavirus cDNA and RNA transfections, and performing alphavirus
infections, are well known to those with ordinary skill in the art.

Oligonucleotides derived from the transcription initiation site, e.g., between about positions
-10 and +10 from the start site, may also be employed to inhibit gene expression. Similarly,
inhibition can be achieved using triple helix base-pairing methodology. Triple helix pairing is useful
because it causes inhibition of the ability of the double helix to open sufficiently for the binding of
polymerases, transcription factors, or regulatory molecules. Recent therapeutic advances using
triplex DNA have been described in the literature. (See, e.g., Gee, J.E. et al. (1994) in Huber, B.E.
and B.I. Carr, Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco NY, pp.
163-177.) A complementary sequence or antisense molecule may also be designed to block translation of
mRNA by preventing the transcript from binding to ribosomes.

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of
RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme
molecule to complementary target RNA, followed by endonucleolytic cleavage. For example,
engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze
endonucleolytic cleavage of sequences encoding PP.

Specific ribozyme cleavage sites within any potential RNA target are initially identified by
scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA,
GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides,
corresponding to the region of the target gene containing the cleavage site, may be evaluated for
secondary structural features which may render the oligonucleotide inoperable. The suitability of
candidate targets may also be evaluated by testing accessibility to hybridization with complementary
oligonucleotides using ribonuclease protection assays.
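
The initial scanning step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the window is simply centered on the cleavage triplet and shifted inward at the ends of the target, and no secondary-structure or accessibility evaluation is attempted, so each returned window is only a candidate for the further evaluation described above.

```python
import re

# Cleavage triplets named in the text: GUA, GUU, and GUC.
_SITE_PATTERN = re.compile(r"GU[AUC]")


def _as_rna(sequence):
    """Upper-case the sequence and convert DNA-style T to RNA-style U."""
    return sequence.upper().replace("T", "U")


def find_cleavage_sites(target):
    """Return (0-based position, triplet) for every GUA, GUU, or GUC in target."""
    rna = _as_rna(target)
    return [(m.start(), m.group()) for m in _SITE_PATTERN.finditer(rna)]


def candidate_window(target, site_start, length=17):
    """Return a stretch of `length` ribonucleotides (15 to 20) that contains
    the triplet starting at `site_start`, roughly centered on it and shifted
    inward when the site lies near either end of the target.
    """
    if not 15 <= length <= 20:
        raise ValueError("length must be between 15 and 20 ribonucleotides")
    rna = _as_rna(target)
    if len(rna) < length:
        raise ValueError("target is shorter than the requested window")
    center = site_start + 1                  # middle base of the triplet
    start = max(0, center - length // 2)
    end = min(len(rna), start + length)
    start = end - length                     # re-anchor if clipped at the 3' end
    return rna[start:end]
```

For example, scanning a hypothetical target with `find_cleavage_sites` and passing each reported position to `candidate_window` yields one 15 to 20 ribonucleotide region per site, each of which would then be examined for secondary structure and tested for accessibility.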

Complementary ribonucleic acid molecules and ribozymes of the invention may be prepared
by any method known in the art for the synthesis of nucleic acid molecules. These include techniques
for chemically synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis.
Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA
sequences encoding PP. Such DNA sequences may be incorporated into a wide variety of vectors
with suitable RNA polymerase promoters such as T7 or SP6. Alternatively, these cDNA constructs
that synthesize complementary RNA, constitutively or inducibly, can be introduced into cell lines,
cells, or tissues.

RNA molecules may be modified to increase intracellular stability and half-life. Possible
modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3'
ends of the molecule, or the use of phosphorothioate or 2' O-methyl rather than phosphodiester
linkages within the backbone of the molecule. This concept is inherent in the production of PNAs
and can be extended in all of these molecules by the inclusion of nontraditional bases such as inosine,
queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine,
cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous
endonucleases.

An additional embodiment of the invention encompasses a method for screening for a
compound which is effective in altering expression of a polynucleotide encoding PP. Compounds
which may be effective in altering expression of a specific polynucleotide may include, but are not
limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming oligonucleotides,
transcription factors and other polypeptide transcriptional regulators, and non-macromolecular
chemical entities which are capable of interacting with specific polynucleotide sequences. Effective
compounds may alter polynucleotide expression by acting as either inhibitors or promoters of
polynucleotide expression. Thus, in the treatment of disorders associated with increased PP
expression or activity, a compound which specifically inhibits expression of the polynucleotide
encoding PP may be therapeutically useful, and in the treatment of disorders associated with
decreased PP expression or activity, a compound which specifically promotes expression of the
polynucleotide encoding PP may be therapeutically useful.

At least one, and up to a plurality, of test compounds may be screened for effectiveness in
altering expression of a specific polynucleotide. A test compound may be obtained by any method
commonly known in the art, including chemical modification of a compound known to be effective in
altering polynucleotide expression; selection from an existing, commercially-available or proprietary
library of naturally-occurring or non-natural chemical compounds; rational design of a compound
based on chemical and/or structural properties of the target polynucleotide; and selection from a
library of chemical compounds created combinatorially or randomly. A sample comprising a
polynucleotide encoding PP is exposed to at least one test compound thus obtained. The sample may
comprise, for example, an intact or permeabilized cell, or an in vitro cell-free or reconstituted
biochemical system. Alterations in the expression of a polynucleotide encoding PP are assayed by
any method commonly known in the art. Typically, the expression of a specific nucleotide is
detected by hybridization with a probe having a nucleotide sequence complementary to the sequence
of the polynucleotide encoding PP. The amount of hybridization may be quantified, thus forming the
basis for a comparison of the expression of the polynucleotide both with and without exposure to one
or more test compounds. Detection of a change in the expression of a polynucleotide exposed to a
test compound indicates that the test compound is effective in altering the expression of the
polynucleotide. A screen for a compound effective in altering expression of a specific polynucleotide
can be carried out, for example, using a Schizosaccharomyces pombe gene expression system
(Atkins, D. et al. (1999) U.S. Patent No. 5,932,435; Arndt, G.M. et al. (2000) Nucleic Acids Res.
28:E15) or a human cell line such as HeLa cell (Clarke, M.L. et al. (2000) Biochem. Biophys. Res.
Commun. 268:8-13). A particular embodiment of the present invention involves screening a
combinatorial library of oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide
nucleic acids, and modified oligonucleotides) for antisense activity against a specific polynucleotide
sequence (Bruice, T.W. et al. (1997) U.S. Patent No. 5,686,242; Bruice, T.W. et al. (2000) U.S.
Patent No. 6,022,691).
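
The comparison of quantified hybridization with and without exposure to a test compound can be sketched in Python as follows. The sketch is illustrative only: the function names, the use of replicate means, and the 1.5-fold threshold are assumptions made for this example and are not specified above, and a practical screen would also require appropriate controls and statistical testing.

```python
from statistics import mean


def expression_fold_change(untreated_signals, treated_signals):
    """Ratio of mean hybridization signal with test compound to mean signal
    without it. Inputs are replicate measurements in the same arbitrary units.
    """
    baseline = mean(untreated_signals)
    if baseline <= 0:
        raise ValueError("mean untreated signal must be positive")
    return mean(treated_signals) / baseline


def classify_compound(fold_change, threshold=1.5):
    """Label a compound by its effect on expression.

    `threshold` is a hypothetical cutoff: a fold change at or above it is
    called a promoter of expression, and one at or below its reciprocal is
    called an inhibitor.
    """
    if threshold <= 1:
        raise ValueError("threshold must be greater than 1")
    if fold_change >= threshold:
        return "promoter"
    if fold_change <= 1 / threshold:
        return "inhibitor"
    return "no effect detected"
```

With hypothetical replicate signals of 100, 110, and 90 without compound and 48, 52, and 50 with compound, the fold change is 0.5 and the compound would be classified as an inhibitor under the assumed threshold.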

Many methods for introducing vectors into cells or tissues are available and equally suitable
for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells
taken from the patient and clonally propagated for autologous transplant back into that same patient.
Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved
using methods which are well known in the art. (See, e.g., Goldman, C.K. et al. (1997) Nat.
Biotechnol. 15:462-466.)

Any of the therapeutic methods described above may be applied to any subject in need of
such therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits, and
monkeys.

An additional embodiment of the invention relates to the administration of a composition
which generally comprises an active ingredient formulated with a pharmaceutically acceptable
excipient. Excipients may include, for example, sugars, starches, celluloses, gums, and proteins.
Various formulations are commonly known and are thoroughly discussed in the latest edition of
Remington's Pharmaceutical Sciences (Maack Publishing, Easton PA). Such compositions may
consist of PP, antibodies to PP, and mimetics, agonists, antagonists, or inhibitors of PP.

The compositions utilized in this invention may be administered by any number of routes
including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary,
intrathecal, intraventricular, pulmonary, transdermal, subcutaneous, intraperitoneal, intranasal,
enteral, topical, sublingual, or rectal means.

Compositions for pulmonary administration may be prepared in liquid or dry powder form.
These compositions are generally aerosolized immediately prior to inhalation by the patient. In the
case of small molecules (e.g. traditional low molecular weight organic drugs), aerosol delivery of
fast-acting formulations is well-known in the art. In the case of macromolecules (e.g. larger peptides
and proteins), recent developments in the field of pulmonary delivery via the alveolar region of the
lung have enabled the practical delivery of drugs such as insulin to blood circulation (see, e.g., Patton,
J.S. et al., U.S. Patent No. 5,997,848). Pulmonary delivery has the advantage of administration
without needle injection, and obviates the need for potentially toxic penetration enhancers.

Compositions suitable for use in the invention include compositions wherein the active
ingredients are contained in an effective amount to achieve the intended purpose. The determination
of an effective dose is well within the capability of those skilled in the art.

Specialized forms of compositions may be prepared for direct intracellular delivery of
macromolecules comprising PP or fragments thereof. For example, liposome preparations containing
a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of the
macromolecule. Alternatively, PP or a fragment thereof may be joined to a short cationic N-terminal
portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to transduce into
the cells of all tissues, including the brain, in a mouse model system (Schwarze, S.R. et al. (1999)
Science 285:1569-1572).

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs,
monkeys, or pigs. An animal model may also be used to determine the appropriate concentration
range and route of administration. Such information can then be used to determine useful doses and
routes for administration in humans.

A therapeutically effective dose refers to that amount of active ingredient, for example PP or
fragments thereof, antibodies of PP, and agonists, antagonists or inhibitors of PP, which ameliorates
the symptoms or condition. Therapeutic efficacy and toxicity may be determined by standard
pharmaceutical procedures in cell cultures or with experimental animals, such as by calculating the
ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the dose lethal to 50% of
the population) statistics. The dose ratio of toxic to therapeutic effects is the therapeutic index, which
can be expressed as the LD50/ED50 ratio. Compositions which exhibit large therapeutic indices are
preferred. The data obtained from cell culture assays and animal studies are used to formulate a range
of dosage for human use. The dosage contained in such compositions is preferably within a range of
circulating concentrations that includes the ED50 with little or no toxicity. The dosage varies within
this range depending upon the dosage form employed, the sensitivity of the patient, and the route of
administration.
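
As an illustrative sketch only, the therapeutic index described above may be computed directly from the two dose statistics. The function name and the example doses are hypothetical and do not represent measured values for any composition.

```python
def therapeutic_index(ld50, ed50):
    """Dose ratio of toxic to therapeutic effect, expressed as LD50/ED50.

    Both doses must be given in the same units (for example, mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses: LD50 of 500 mg/kg and ED50 of 25 mg/kg.
index = therapeutic_index(500.0, 25.0)
```

A larger index indicates a wider margin between the effective dose and the lethal dose, which is why compositions with large therapeutic indices are preferred.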

The exact dosage will be determined by the practitioner, in light of factors related to the
subject requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the
active moiety or to maintain the desired effect. Factors which may be taken into account include the
severity of the disease state, the general health of the subject, the age, weight, and gender of the
subject, time and frequency of administration, drug combination(s), reaction sensitivities, and
response to therapy. Long-acting compositions may be administered every 3 to 4 days, every week,
or biweekly depending on the half-life and clearance rate of the particular formulation.

Normal dosage amounts may vary from about 0.1 µg to 100,000 µg, up to a total dose of
about 1 gram, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art.
Those skilled in the art will employ different formulations for nucleotides than for proteins or their
inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells,
conditions, locations, etc.
DIAGNOSTICS 

In another embodiment, antibodies which specifically bind PP may be used for the diagnosis
of disorders characterized by expression of PP, or in assays to monitor patients being treated with PP
or agonists, antagonists, or inhibitors of PP. Antibodies useful for diagnostic purposes may be
prepared in the same manner as described above for therapeutics. Diagnostic assays for PP include
methods which utilize the antibody and a label to detect PP in human body fluids or in extracts of
cells or tissues. The antibodies may be used with or without modification, and may be labeled by
covalent or non-covalent attachment of a reporter molecule. A wide variety of reporter molecules,
several of which are described above, are known in the art and may be used.

A variety of protocols for measuring PP, including ELISAs, RIAs, and FACS, are known in
the art and provide a basis for diagnosing altered or abnormal levels of PP expression. Normal or
standard values for PP expression are established by combining body fluids or cell extracts taken
from normal mammalian subjects, for example, human subjects, with antibodies to PP under
conditions suitable for complex formation. The amount of standard complex formation may be
quantitated by various methods, such as photometric means. Quantities of PP expressed in subject,
control, and disease samples from biopsied tissues are compared with the standard values. Deviation
between standard and subject values establishes the parameters for diagnosing disease.
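
One possible way to express the deviation between standard and subject values is sketched below, assuming that the standard values are summarized by the mean and sample standard deviation of measurements from normal subjects. This choice of statistic, the function names, and the example readings are assumptions made for illustration only.

```python
from statistics import mean, stdev


def standard_values(normal_measurements):
    """Mean and sample standard deviation of complex formation in normal subjects."""
    return mean(normal_measurements), stdev(normal_measurements)


def deviation_from_standard(subject_value, normal_measurements):
    """Number of standard deviations by which a subject value departs from the standard."""
    mu, sigma = standard_values(normal_measurements)
    if sigma == 0:
        raise ValueError("normal measurements show no variation")
    return (subject_value - mu) / sigma


# Hypothetical photometric readings from five normal subjects and one test subject.
normal_readings = [10.0, 12.0, 11.0, 9.0, 13.0]
z = deviation_from_standard(15.0, normal_readings)
```

The magnitude of deviation that is treated as diagnostic is a parameter to be established for each assay and is not fixed by this sketch.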

In another embodiment of the invention, the polynucleotides encoding PP may be used for
diagnostic purposes. The polynucleotides which may be used include oligonucleotide sequences,
complementary RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect
and quantify gene expression in biopsied tissues in which expression of PP may be correlated with
disease. The diagnostic assay may be used to determine absence, presence, and excess expression of
PP, and to monitor regulation of PP levels during therapeutic intervention.

In one aspect, hybridization with PCR probes which are capable of detecting polynucleotide
sequences, including genomic sequences, encoding PP or closely related molecules may be used to
identify nucleic acid sequences which encode PP. The specificity of the probe, whether it is made
from a highly specific region, e.g., the 5' regulatory region, or from a less specific region, e.g., a
conserved motif, and the stringency of the hybridization or amplification will determine whether the
probe identifies only naturally occurring sequences encoding PP, allelic variants, or related
sequences.

Probes may also be used for the detection of related sequences, and may have at least 50%
sequence identity to any of the PP encoding sequences. The hybridization probes of the subject
invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO:11-20 or from
genomic sequences including promoters, enhancers, and introns of the PP gene.
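
For illustration, the percent sequence identity between a probe and a PP encoding sequence may be computed over a pairwise alignment as sketched below. The sketch assumes that the two sequences have already been aligned by some other means and uses "-" as a gap character. The function name and the way gaps are counted are assumptions, since several conventions for computing identity exist.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns in which both sequences carry the same residue.

    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=50.0):
    """True when the aligned pair has at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under this convention, two aligned eight-base sequences differing at two positions have 75% identity and would satisfy a 50% threshold.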

Means for producing specific hybridization probes for DNAs encoding PP include the cloning
of polynucleotide sequences encoding PP or PP derivatives into vectors for the production of mRNA
probes. Such vectors are known in the art, are commercially available, and may be used to synthesize
RNA probes in vitro by means of the addition of the appropriate RNA polymerases and the
appropriate labeled nucleotides. Hybridization probes may be labeled by a variety of reporter groups,
for example, by radionuclides such as 32P or 35S, or by enzymatic labels, such as alkaline phosphatase
coupled to the probe via avidin/biotin coupling systems, and the like.

Polynucleotide sequences encoding PP may be used for the diagnosis of disorders associated
with expression of PP. Examples of such disorders include, but are not limited to, an immune system
disorder, such as acquired immunodeficiency syndrome (AIDS), X-linked agammaglobulinemia of
Bruton, common variable immunodeficiency (CVI), DiGeorge's syndrome (thymic hypoplasia),
thymic dysplasia, isolated IgA deficiency, severe combined immunodeficiency disease (SCID),
immunodeficiency with thrombocytopenia and eczema (Wiskott-Aldrich syndrome), Chediak-Higashi
syndrome, chronic granulomatous diseases, hereditary angioneurotic edema, immunodeficiency
associated with Cushing's disease, Addison's disease, adult respiratory distress syndrome, allergies,
ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia,
autoimmune thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy
(APECED), bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis,
dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins,
erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's
syndrome, gout, Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel
syndrome, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation,
osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid
arthritis, scleroderma, Sjögren's syndrome, systemic anaphylaxis, systemic lupus erythematosus,
systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome,
complications of cancer, hemodialysis, and extracorporeal circulation, viral, bacterial, fungal,
parasitic, protozoal, and helminthic infections, and trauma; a neurological disorder, such as epilepsy,
ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease,
Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic
lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis
pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and
viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial
thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases
including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal
familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis,
tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental
retardation and other developmental disorders of the central nervous system including Down
syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve
disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral
nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and
toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety,
and schizophrenic disorders, seasonal affective disorder (SAD), akathesia, amnesia, catatonia,
diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia,
Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, and familial
frontotemporal dementia; a developmental disorder, such as renal tubular acidosis, anemia, Cushing's
syndrome, achondroplastic dwarfism, Duchenne and Becker muscular dystrophy, epilepsy, gonadal
dysgenesis, WAGR syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental
retardation), Smith-Magenis syndrome, myelodysplastic syndrome, hereditary mucoepithelial
dysplasia, hereditary keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and
neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea
and cerebral palsy, spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, and
sensorineural hearing loss; and a cell proliferative disorder, such as actinic keratosis, arteriosclerosis,
atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis,
paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and
cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma,
teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain,
breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary,
pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus.
The polynucleotide sequences encoding PP may be used in Southern or northern analysis, dot blot, or
other membrane-based technologies; in PCR technologies; in dipstick, pin, and multiformat ELISA-like
assays; and in microarrays utilizing fluids or tissues from patients to detect altered PP expression.
Such qualitative or quantitative methods are well known in the art.

In a particular aspect, the nucleotide sequences encoding PP may be useful in assays that
detect the presence of associated disorders, particularly those mentioned above. The nucleotide
sequences encoding PP may be labeled by standard methods and added to a fluid or tissue sample
from a patient under conditions suitable for the formation of hybridization complexes. After a
suitable incubation period, the sample is washed and the signal is quantified and compared with a
[...]

[...] presence of a disorder.

Once the presence of a disorder is established and a treatment protocol is initiated,
hybridization assays may be repeated on a regular basis to determine if the level of expression in the
patient begins to approximate that which is observed in the normal subject. The results obtained from
successive assays may be used to show the efficacy of treatment over a period ranging from several
days to months.

With respect to cancer, the presence of an abnormal amount of transcript (either under- or
overexpressed) in biopsied tissue from an individual may indicate a predisposition for the
development of the disease, or may provide a means for detecting the disease prior to the appearance
of actual clinical symptoms. A more definitive diagnosis of this type may allow health professionals
to employ preventative measures or aggressive treatment earlier, thereby preventing the development
or further progression of the cancer.

Additional diagnostic uses for oligonucleotides designed from the sequences encoding PP
may involve the use of PCR. These oligomers may be chemically synthesized, generated
enzymatically, or produced in vitro. Oligomers will preferably contain a fragment of a polynucleotide
encoding PP, or a fragment of a polynucleotide complementary to the polynucleotide encoding PP,
and will be employed under optimized conditions for identification of a specific gene or condition.
Oligomers may also be employed under less stringent conditions for detection or quantification of
closely related DNA or RNA sequences.

In a particular aspect, oligonucleotide primers derived from the polynucleotide sequences
encoding PP may be used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions,
insertions and deletions that are a frequent cause of inherited or acquired genetic disease in humans.
Methods of SNP detection include, but are not limited to, single-stranded conformation
polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers
derived from the polynucleotide sequences encoding PP are used to amplify DNA using the
polymerase chain reaction (PCR). The DNA may be derived, for example, from diseased or normal
tissue, biopsy samples, bodily fluids, and the like. SNPs in the DNA cause differences in the
secondary and tertiary structures of PCR products in single-stranded form, and these differences are
detectable using gel electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers
are fluorescently labeled, which allows detection of the amplimers in high-throughput equipment such
as DNA sequencing machines. Additionally, sequence database analysis methods, termed in silico
SNP (isSNP), are capable of identifying polymorphisms by comparing the sequence of individual
overlapping DNA fragments which assemble into a common consensus sequence. These computer-based
methods filter out sequence variations due to laboratory preparation of DNA and sequencing
errors using statistical models and automated analyses of DNA sequence chromatograms. In the
alternative, SNPs may be detected and characterized by mass spectrometry using, for example, the
high throughput MASSARRAY system (Sequenom, Inc., San Diego CA).
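
The in silico comparison of overlapping fragments against a consensus may be sketched, in greatly simplified form, as follows. The sketch assumes that the fragments have already been aligned to common coordinates and padded to equal length with "-". In place of the statistical models and chromatogram analysis mentioned above, it uses a fixed minimum count for the second most common base, which is an assumption made purely for illustration. All names are hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def column_counts(aligned_fragments, position):
    """Count the bases observed at one aligned position, ignoring padding."""
    return Counter(f[position] for f in aligned_fragments if f[position] in BASES)


def consensus(aligned_fragments):
    """Most common base at each position; 'N' where no fragment covers the position."""
    length = len(aligned_fragments[0])
    result = []
    for position in range(length):
        counts = column_counts(aligned_fragments, position)
        result.append(counts.most_common(1)[0][0] if counts else "N")
    return "".join(result)


def candidate_snps(aligned_fragments, min_minor_count=2):
    """Positions where a second base is seen in at least `min_minor_count` fragments.

    Variants seen fewer times are treated as likely sequencing errors and dropped.
    """
    if not aligned_fragments:
        raise ValueError("at least one fragment is required")
    length = len(aligned_fragments[0])
    if any(len(f) != length for f in aligned_fragments):
        raise ValueError("fragments must be aligned to equal length")
    snps = {}
    for position in range(length):
        ranked = column_counts(aligned_fragments, position).most_common(2)
        if len(ranked) == 2 and ranked[1][1] >= min_minor_count:
            snps[position] = dict(ranked)
    return snps
```

With five aligned fragments in which one position shows G three times and C twice, that position is reported as a candidate SNP, while a variant observed in only a single fragment is filtered out.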

Methods which may also be used to quantify the expression of PP include radiolabeling or
biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from
standard curves. (See, e.g., Melby, P.C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C.
et al. (1993) Anal. Biochem. 212:229-236.) The speed of quantitation of multiple samples may be
accelerated by running the assay in a high-throughput format where the oligomer or polynucleotide of
interest is presented in various dilutions and a spectrophotometric or colorimetric response gives
rapid quantitation.
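
Interpolation of a result from a standard curve may be illustrated by the following minimal sketch, which assumes a piecewise linear curve through a set of standards whose responses increase with amount. The function name, the units, and the example absorbance values are hypothetical.

```python
def interpolate_from_standard_curve(standards, response):
    """Estimate an amount from a measured response by linear interpolation.

    `standards` is a list of (known_amount, measured_response) pairs.
    Responses outside the range spanned by the standards are rejected
    rather than extrapolated.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= response <= y1:
            if y1 == y0:
                raise ValueError("two standards share the same response")
            return x0 + (x1 - x0) * (response - y0) / (y1 - y0)
    raise ValueError("response lies outside the standard curve")


# Hypothetical standards: (amount in ng, absorbance).
curve = [(0.0, 0.05), (1.0, 0.25), (2.0, 0.45), (4.0, 0.85)]
estimate = interpolate_from_standard_curve(curve, 0.35)
```

Here an absorbance of 0.35 falls midway between the 1 ng and 2 ng standards, giving an estimate of about 1.5 ng.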

In further embodiments, oligonucleotides or longer fragments derived from any of the
polynucleotide sequences described herein may be used as elements on a microarray. The microarray
can be used in transcript imaging techniques which monitor the relative expression levels of large
numbers of genes simultaneously as described below. The microarray may also be used to identify
genetic variants, mutations, and polymorphisms. This information may be used to determine gene
function, to understand the genetic basis of a disorder, to diagnose a disorder, to monitor
progression/regression of disease as a function of gene expression, and to develop and monitor the
activities of therapeutic agents in the treatment of disease. In particular, this information may be used
to develop a pharmacogenomic profile of a patient in order to select the most appropriate and
effective treatment regimen for that patient. For example, therapeutic agents which are highly
effective and display the fewest side effects may be selected for a patient based on his/her
pharmacogenomic profile.

In another embodiment, PP, fragments of PP, or antibodies specific for PP may be used as
elements on a microarray. The microarray may be used to monitor or measure protein-protein
interactions, drug-target interactions, and gene expression profiles, as described above.

A particular embodiment relates to the use of the polynucleotides of the present invention to
generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of
gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by
quantifying the number of expressed genes and their relative abundance under given conditions and at
a given time. (See Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent Number
5,840,484, expressly incorporated by reference herein.) Thus a transcript image may be generated by
hybridizing the polynucleotides of the present invention or their complements to the totality of
transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the
hybridization takes place in high-throughput format, wherein the polynucleotides of the present
invention or their complements comprise a subset of a plurality of elements on a microarray. The
resultant transcript image would provide a profile of gene activity.

Transcript images may be generated using transcripts isolated from tissues, cell lines,
biopsies, or other biological samples. The transcript image may thus reflect gene expression in vivo,
as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.

Transcript images which profile the expression of the polynucleotides of the present
invention may also be used in conjunction with in vitro model systems and preclinical evaluation of
pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental
compounds. All compounds induce characteristic gene expression patterns, frequently termed
molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and
toxicity (Nuwaysir, E.F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N.L. Anderson
(2000) Toxicol. Lett. 112-113:467-471, expressly incorporated by reference herein). If a test
compound has a signature similar to that of a compound with known toxicity, it is likely to share
those toxic properties. These fingerprints or signatures are most useful and refined when they contain
expression information from a large number of genes and gene families. Ideally, a genome-wide
measurement of expression provides the highest quality signature. Even genes whose expression is
not altered by any tested compounds are important as well, as the levels of expression of these genes
are used to normalize the rest of the expression data. The normalization procedure is useful for
comparison of expression data after treatment with different compounds. While the assignment of
gene function to elements of a toxicant signature aids in interpretation of toxicity mechanisms,
knowledge of gene function is not necessary for the statistical matching of signatures which leads to
prediction of toxicity. (See, for example, Press Release 00-02 from the National Institute of
Environmental Health Sciences, released February 29, 2000, available at
http://www.niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is important and desirable in
toxicological screening using toxicant signatures to include all expressed gene sequences.
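
The normalization and statistical matching of signatures described above may be illustrated by the following sketch. It assumes that expression levels are positive numbers keyed by gene name, that normalization divides by the mean level of genes known to be unaltered, and that similarity is measured by Pearson correlation of log ratios. These choices, and all names in the code, are assumptions made for the example and are not a method prescribed herein.

```python
from math import log2, sqrt


def normalize(expression, reference_genes):
    """Scale every gene by the mean level of the unaltered reference genes."""
    reference_level = sum(expression[g] for g in reference_genes) / len(reference_genes)
    if reference_level <= 0:
        raise ValueError("reference genes must have a positive mean level")
    return {gene: level / reference_level for gene, level in expression.items()}


def toxicant_signature(treated, untreated, reference_genes):
    """Log2 ratio of normalized treated to normalized untreated level, per gene.

    All expression levels are assumed to be strictly positive.
    """
    t = normalize(treated, reference_genes)
    u = normalize(untreated, reference_genes)
    return {g: log2(t[g] / u[g]) for g in t if g in u and g not in reference_genes}


def signature_similarity(sig_a, sig_b):
    """Pearson correlation of two signatures over the genes they share."""
    genes = sorted(set(sig_a) & set(sig_b))
    if len(genes) < 2:
        raise ValueError("need at least two shared genes")
    a = [sig_a[g] for g in genes]
    b = [sig_b[g] for g in genes]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant signature has no defined correlation")
    return cov / sqrt(var_a * var_b)
```

A test compound whose signature correlates strongly with that of a compound of known toxicity would, under this sketch, be flagged for further evaluation. Note that no knowledge of gene function is used in the matching.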

In one embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the
treated biological sample are hybridized with one or more probes specific to the polynucleotides of
the present invention, so that transcript levels corresponding to the polynucleotides of the present
invention may be quantified. The transcript levels in the treated biological sample are compared with
levels in an untreated biological sample. Differences in the transcript levels between the two samples
are indicative of a toxic response caused by the test compound in the treated sample.

Another particular embodiment relates to the use of the polypeptide sequences of the present
invention to analyze the proteome of a tissue or cell type. The term proteome refers to the global
pattern of protein expression in a particular tissue or cell type. Each protein component of a
proteome can be subjected individually to further analysis. Proteome expression patterns, or profiles,
are analyzed by quantifying the number of expressed proteins and their relative abundance under
given conditions and at a given time. A profile of a cell's proteome may thus be generated by
separating and analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the
separation is achieved using two-dimensional gel electrophoresis, in which proteins from a sample are
separated by isoelectric focusing in the first dimension, and then according to molecular weight by
sodium dodecyl sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson,
supra). The proteins are visualized in the gel as discrete and uniquely positioned spots, typically by
staining the gel with an agent such as Coomassie Blue or silver or fluorescent stains. The optical
density of each protein spot is generally proportional to the level of the protein in the sample. The
optical densities of equivalently positioned protein spots from different samples, for example, from
biological samples either treated or untreated with a test compound or therapeutic agent, are
compared to identify any changes in protein spot density related to the treatment. The proteins in the
spots are partially sequenced using, for example, standard methods employing chemical or enzymatic
cleavage followed by mass spectrometry. The identity of the protein in a spot may be determined by
comparing its partial sequence, preferably of at least 5 contiguous amino acid residues, to the
polypeptide sequences of the present invention. In some cases, further sequence data may be
obtained for definitive protein identification.
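
The comparison of a partial sequence to reference polypeptide sequences may be sketched as an exact substring search, as shown below. The reference sequences and names in the example are invented placeholders and are not sequences of the present invention. Exact matching is an assumption; a practical implementation might tolerate sequencing ambiguities.

```python
def identify_protein(partial_sequence, reference_polypeptides, min_length=5):
    """Return the names of reference polypeptides containing the partial sequence.

    `reference_polypeptides` maps a name to a one-letter amino acid sequence.
    Fragments shorter than `min_length` contiguous residues are rejected.
    """
    fragment = partial_sequence.upper()
    if len(fragment) < min_length:
        raise ValueError("partial sequence is too short for identification")
    return [
        name
        for name, sequence in reference_polypeptides.items()
        if fragment in sequence.upper()
    ]


# Invented placeholder sequences for illustration only.
references = {
    "example-1": "MSTKLLEQGRAVDPYT",
    "example-2": "MGGHWTLEQAAPRS",
}
hits = identify_protein("EQGRAV", references)
```

A fragment matching more than one reference would return several names, which corresponds to the case noted above in which further sequence data may be needed for definitive identification.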

A proteomic profile may also be generated using antibodies specific for PP to quantify the
levels of PP expression. In one embodiment, the antibodies are used as elements on a microarray, and
protein expression levels are quantified by exposing the microarray to the sample and detecting the
levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem. 270:103-111;
Mendoza, L.G. et al. (1999) Biotechniques 27:778-788). Detection may be performed by a variety of
methods known in the art, for example, by reacting the proteins in the sample with a thiol- or
amino-reactive fluorescent compound and detecting the amount of fluorescence bound at each array element.

Toxicant signatures at the proteome level are also useful for toxicological screening, and
should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor
correlation between transcript and protein abundances for some proteins in some tissues (Anderson,
N.L. and J. Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be
useful in the analysis of compounds which do not significantly affect the transcript image, but which
alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to
rapid degradation of mRNA, so proteomic profiling may be more reliable and informative in such
cases.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins that are expressed in the treated
biological sample are separated so that the amount of each protein can be quantified. The amount of
each protein is compared to the amount of the corresponding protein in an untreated biological
sample. A difference in the amount of protein between the two samples is indicative of a toxic
response to the test compound in the treated sample. Individual proteins are identified by sequencing
the amino acid residues of the individual proteins and comparing these partial sequences to the
polypeptides of the present invention.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins from the biological sample are
incubated with antibodies specific to the polypeptides of the present invention. The amount of
protein recognized by the antibodies is quantified. The amount of protein in the treated biological
sample is compared with the amount in an untreated biological sample. A difference in the amount of
protein between the two samples is indicative of a toxic response to the test compound in the treated
sample.

Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g.,
Brennan, T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci.
USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al.
(1995) PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA
94:2150-2155; and Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662.) Various types of microarrays are
well known and thoroughly described in DNA Microarrays: A Practical Approach, M. Schena, ed.
(1999) Oxford University Press, London, hereby expressly incorporated by reference.

In another embodiment of the invention, nucleic acid sequences encoding PP may be used to
generate hybridization probes useful in mapping the naturally occurring genomic sequence. Either
coding or noncoding sequences may be used, and in some instances, noncoding sequences may be
preferable over coding sequences. For example, conservation of a coding sequence among members
of a multi-gene family may potentially cause undesired cross hybridization during chromosomal
mapping. The sequences may be mapped to a particular chromosome, to a specific region of a
chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs),
yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1
constructions, or single chromosome cDNA libraries. (See, e.g., Harrington, J.J. et al. (1997) Nat.
Genet. 15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; and Trask, B.J. (1991) Trends Genet.
7:149-154.) Once mapped, the nucleic acid sequences of the invention may be used to develop
genetic linkage maps, for example, which correlate the inheritance of a disease state with the
inheritance of a particular chromosome region or restriction fragment length polymorphism (RFLP).
(See, for example, Lander, E.S. and D. Botstein (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357.)

Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic
map data. (See, e.g., Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968.) Examples of genetic
map data can be found in various scientific journals or at the Online Mendelian Inheritance in Man
(OMIM) World Wide Web site. Correlation between the location of the gene encoding PP on a
physical map and a specific disorder, or a predisposition to a specific disorder, may help define the
region of DNA associated with that disorder and thus may further positional cloning efforts.

In situ hybridization of chromosomal preparations and physical mapping techniques, such as
linkage analysis using established chromosomal markers, may be used for extending genetic maps.
Often the placement of a gene on the chromosome of another mammalian species, such as mouse,
may reveal associated markers even if the exact chromosomal locus is not known. This information is
valuable to investigators searching for disease genes using positional cloning or other gene discovery
techniques. Once the gene or genes responsible for a disease or syndrome have been crudely
localized by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23,
any sequences mapping to that area may represent associated or regulatory genes for further
investigation. (See, e.g., Gatti, R.A. et al. (1988) Nature 336:577-580.) The nucleotide sequence of
the instant invention may also be used to detect differences in the chromosomal location due to
translocation, inversion, etc., among normal, carrier, or affected individuals.

In another embodiment of the invention, PP, its catalytic or immunogenic fragments, or
oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug
screening techniques. The fragment employed in such screening may be free in solution, affixed to a
solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes
between PP and the agent being tested may be measured.

Another technique for drug screening provides for high throughput screening of compounds
having suitable binding affinity to the protein of interest. (See, e.g., Geysen, et al. (1984) PCT
application WO84/03564.) In this method, large numbers of different small test compounds are
synthesized on a solid substrate. The test compounds are reacted with PP, or fragments thereof, and
washed. Bound PP is then detected by methods well known in the art. Purified PP can also be coated
directly onto plates for use in the aforementioned drug screening techniques. Alternatively,
non-neutralizing antibodies can be used to capture the peptide and immobilize it on a solid support.

In another embodiment, one may use competitive drug screening assays in which neutralizing
antibodies capable of binding PP specifically compete with a test compound for binding PP. In this
manner, antibodies can be used to detect the presence of any peptide which shares one or more
antigenic determinants with PP.

In additional embodiments, the nucleotide sequences which encode PP may be used in any
molecular biology techniques that have yet to be developed, provided the new techniques rely on
properties of nucleotide sequences that are currently known, including, but not limited to, such
properties as the triplet genetic code and specific base pair interactions.

Without further elaboration, it is believed that one skilled in the art can, using the preceding
description, utilize the present invention to its fullest extent. The following embodiments are,
therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure
in any way whatsoever.

The disclosures of all patents, applications and publications, mentioned above and below,
including U.S. Ser. No. 60/221,679, U.S. Ser. No. 60/223,272, U.S. Ser. No. 60/224,309, U.S. Ser.
No. 60/226,728, U.S. Ser. No. 60/229,254, and U.S. Ser. No. 60/231,366, are expressly incorporated
by reference herein.

EXAMPLES
I. Construction of cDNA Libraries

Incyte cDNAs were derived from cDNA libraries described in the LIFESEQ GOLD database
(Incyte Genomics, Palo Alto CA) and shown in Table 4, column 5. Some tissues were homogenized
and lysed in guanidinium isothiocyanate, while others were homogenized and lysed in phenol or in a
suitable mixture of denaturants, such as TRIZOL (Life Technologies), a monophasic solution of
phenol and guanidine isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or
extracted with chloroform. RNA was precipitated from the lysates with either isopropanol or sodium
acetate and ethanol, or by other routine methods.

Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA
purity. In some cases, RNA was treated with DNase. For most libraries, poly(A)+ RNA was isolated
using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles (QIAGEN,
Chatsworth CA), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA was
isolated directly from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA
purification kit (Ambion, Austin TX).

In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA
libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP
vector system (Stratagene) or SUPERSCRIPT plasmid system (Life Technologies), using the
recommended procedures or similar methods known in the art. (See, e.g., Ausubel, 1997, supra, units
5.1-6.6.) Reverse transcription was initiated using oligo d(T) or random primers. Synthetic
oligonucleotide adapters were ligated to double stranded cDNA, and the cDNA was digested with the
appropriate restriction enzyme or enzymes. For most libraries, the cDNA was size-selected
(300-1000 bp) using SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column
chromatography (Amersham Pharmacia Biotech) or preparative agarose gel electrophoresis. cDNAs
were ligated into compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g.,
PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid (Life Technologies), PCDNA2.1 plasmid
(Invitrogen, Carlsbad CA), PBK-CMV plasmid (Stratagene), or pINCY (Incyte Genomics, Palo Alto
CA), or derivatives thereof. Recombinant plasmids were transformed into competent E. coli cells
including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α, DH10B, or ElectroMAX
DH10B from Life Technologies.

II. Isolation of cDNA Clones

Plasmids obtained as described in Example I were recovered from host cells by in vivo
excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using
at least one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an
AGTC Miniprep purification kit (Edge Biosystems, Gaithersburg MD); and QIAWELL 8 Plasmid,
QIAWELL 8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP 96
plasmid purification kit from QIAGEN. Following precipitation, plasmids were resuspended in 0.1
ml of distilled water and stored, with or without lyophilization, at 4°C.

Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a
high-throughput format (Rao, V.B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal
cycling steps were carried out in a single reaction mixture. Samples were processed and stored in
384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically
using PICOGREEN dye (Molecular Probes, Eugene OR) and a FLUOROSKAN II fluorescence
scanner (Labsystems Oy, Helsinki, Finland).
III. Sequencing and Analysis

Incyte cDNA recovered in plasmids as described in Example II were sequenced as follows.
Sequencing reactions were processed using standard methods or high-throughput instrumentation
such as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal
cycler (MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the
MICROLAB 2200 (Hamilton) liquid transfer system. cDNA sequencing reactions were prepared
using reagents provided by Amersham Pharmacia Biotech or supplied in ABI sequencing kits such as
the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).
Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides
were carried out using the MEGABACE 1000 DNA sequencing system (Molecular Dynamics); the
ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI
protocols and base calling software; or other sequence analysis systems known in the art. Reading
frames within the cDNA sequences were identified using standard methods (reviewed in Ausubel,
1997, supra, unit 7.7). Some of the cDNA sequences were selected for extension using the techniques
disclosed in Example VIII.

The polynucleotide sequences derived from Incyte cDNAs were validated by removing
vector, linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and
programs based on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The
Incyte cDNA sequences or translations thereof were then queried against a selection of public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and
BLOCKS, PRINTS, DOMO, PRODOM, and hidden Markov model (HMM)-based protein family
databases such as PFAM. (HMM is a probabilistic approach which analyzes consensus primary
structures of gene families. See, for example, Eddy, S.R. (1996) Curr. Opin. Struct. Biol. 6:361-365.)
The queries were performed using programs based on BLAST, FASTA, BLIMPS, and HMMER. The
Incyte cDNA sequences were assembled to produce full length polynucleotide sequences.
Alternatively, GenBank cDNAs, GenBank ESTs, stitched sequences, stretched sequences, or
Genscan-predicted coding sequences (see Examples IV and V) were used to extend Incyte cDNA
assemblages to full length. Assembly was performed using programs based on Phred, Phrap, and
Consed, and cDNA assemblages were screened for open reading frames using programs based on
GeneMark, BLAST, and FASTA. The full length polynucleotide sequences were translated to derive
the corresponding full length polypeptide sequences. Alternatively, a polypeptide of the invention
may begin at any of the methionine residues of the full length translated polypeptide. Full length
polypeptide sequences were subsequently analyzed by querying against databases such as the
GenBank protein databases (genpept), SwissProt, BLOCKS, PRINTS, DOMO, PRODOM, Prosite,
and hidden Markov model (HMM)-based protein family databases such as PFAM. Full length
polynucleotide sequences are also analyzed using MACDNASIS PRO software (Hitachi Software
Engineering, South San Francisco CA) and LASERGENE software (DNASTAR). Polynucleotide
and polypeptide sequence alignments are generated using default parameters specified by the
CLUSTAL algorithm as incorporated into the MEGALIGN multisequence alignment program
(DNASTAR), which also calculates the percent identity between aligned sequences.

Table 7 summarizes the tools, programs, and algorithms used for the analysis and assembly of
Incyte cDNA and full length sequences and provides applicable descriptions, references, and
threshold parameters. The first column of Table 7 shows the tools, programs, and algorithms used,
the second column provides brief descriptions thereof, the third column presents appropriate
references, all of which are incorporated by reference herein in their entirety, and the fourth column
presents, where applicable, the scores, probability values, and other parameters used to evaluate the
strength of a match between two sequences (the higher the score or the lower the probability value,
the greater the identity between two sequences).

The programs described above for the assembly and analysis of full length polynucleotide
and polypeptide sequences were also used to identify polynucleotide sequence fragments from SEQ
ID NO:11-20. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization
and amplification technologies are described in Table 4, column 4.

IV. Identification and Editing of Coding Sequences from Genomic DNA 

Putative protein phosphatases were initially identified by running the Genscan gene
identification program against public genomic sequence databases (e.g., gbpri and gbhtg). Genscan is
a general-purpose gene identification program which analyzes genomic DNA sequences from a
variety of organisms (See Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94, and Burge, C. and
S. Karlin (1998) Curr. Opin. Struct. Biol. 8:346-354). The program concatenates predicted exons to
form an assembled cDNA sequence extending from a methionine to a stop codon. The output of
Genscan is a FASTA database of polynucleotide and polypeptide sequences. The maximum range of
sequence for Genscan to analyze at once was set to 30 kb. To determine which of these Genscan
predicted cDNA sequences encode protein phosphatases, the encoded polypeptides were analyzed by
querying against PFAM models for protein phosphatases. Potential protein phosphatases were also
identified by homology to Incyte cDNA sequences that had been annotated as protein phosphatases.
These selected Genscan-predicted sequences were then compared by BLAST analysis to the genpept
and gbpri public databases. Where necessary, the Genscan-predicted sequences were then edited by
comparison to the top BLAST hit from genpept to correct errors in the sequence predicted by
Genscan, such as extra or omitted exons. BLAST analysis was also used to find any Incyte cDNA or
public cDNA coverage of the Genscan-predicted sequences, thus providing evidence for transcription.
When Incyte cDNA coverage was available, this information was used to correct or confirm the
Genscan predicted sequence. Full length polynucleotide sequences were obtained by assembling
Genscan-predicted coding sequences with Incyte cDNA sequences and/or public cDNA sequences
using the assembly process described in Example III. Alternatively, full length polynucleotide
sequences were derived entirely from edited or unedited Genscan-predicted coding sequences.
V. Assembly of Genomic Sequence Data with cDNA Sequence Data 
"Stitched" Sequences 

Partial cDNA sequences were extended with exons predicted by the Genscan gene
identification program described in Example IV. Partial cDNAs assembled as described in Example
III were mapped to genomic DNA and parsed into clusters containing related cDNAs and Genscan
exon predictions from one or more genomic sequences. Each cluster was analyzed using an algorithm
based on graph theory and dynamic programming to integrate cDNA and genomic information,
generating possible splice variants that were subsequently confirmed, edited, or extended to create a
full length sequence. Sequence intervals in which the entire length of the interval was present on
more than one sequence in the cluster were identified, and intervals thus identified were considered to
be equivalent by transitivity. For example, if an interval was present on a cDNA and two genomic
sequences, then all three intervals were considered to be equivalent. This process allows unrelated
but consecutive genomic sequences to be brought together, bridged by cDNA sequence. Intervals
thus identified were then "stitched" together by the stitching algorithm in the order that they appear
along their parent sequences to generate the longest possible sequence, as well as sequence variants.
Linkages between intervals which proceed along one type of parent sequence (cDNA to cDNA or
genomic sequence to genomic sequence) were given preference over linkages which change parent
type (cDNA to genomic sequence). The resultant stitched sequences were translated and compared
by BLAST analysis to the genpept and gbpri public databases. Incorrect exons predicted by Genscan
were corrected by comparison to the top BLAST hit from genpept. Sequences were further extended
with additional cDNA sequences, or by inspection of genomic DNA, when necessary.
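
By way of illustration only, the grouping of intervals "by transitivity" can be sketched with a
union-find structure. This is not the stitching algorithm itself, which is not disclosed in detail here;
the function name and the interval labels below are hypothetical, and the sketch covers only the
equivalence step, not the ordering or preference rules.

```python
from collections import defaultdict


def equivalence_classes(pairs):
    """Group items so that any two items linked directly or indirectly share a class."""
    parent = {}

    def find(x):
        # Register unseen items as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two classes

    groups = defaultdict(set)
    for item in list(parent):
        groups[find(item)].add(item)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical labels: one cDNA interval observed on two genomic sequences,
# plus an unrelated interval pair.
observed = [
    ("cDNA_1:exon2", "genomic_A:exon2"),
    ("cDNA_1:exon2", "genomic_B:exon2"),
    ("cDNA_2:exon1", "genomic_C:exon1"),
]
classes = equivalence_classes(observed)
print(classes)
```

In this sketch the two genomic intervals are never compared with each other directly; they fall into
one class only because each is linked to the same cDNA interval.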
"Stretched" Sequences 

Partial DNA sequences were extended to full length with an algorithm based on BLAST
analysis. First, partial cDNAs assembled as described in Example III were queried against public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases
using the BLAST program. The nearest GenBank protein homolog was then compared by BLAST
analysis to either Incyte cDNA sequences or GenScan exon predicted sequences described in
Example IV. A chimeric protein was generated by using the resultant high-scoring segment pairs
(HSPs) to map the translated sequences onto the GenBank protein homolog. Insertions or deletions
may occur in the chimeric protein with respect to the original GenBank protein homolog. The
GenBank protein homolog, the chimeric protein, or both were used as probes to search for
homologous genomic sequences from the public human genome databases. Partial DNA sequences
were therefore "stretched" or extended by the addition of homologous genomic sequences. The
resultant stretched sequences were examined to determine whether they contained a complete gene.
VI. Chromosomal Mapping of PP Encoding Polynucleotides

The sequences which were used to assemble SEQ ID NO:11-20 were compared with
sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other
implementations of the Smith-Waterman algorithm. Sequences from these databases that matched
SEQ ID NO:11-20 were assembled into clusters of contiguous and overlapping sequences using
assembly algorithms such as Phrap (Table 7). Radiation hybrid and genetic mapping data available
from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for
Genome Research (WIGR), and Généthon were used to determine if any of the clustered sequences
had been previously mapped. Inclusion of a mapped sequence in a cluster resulted in the assignment
of all sequences of that cluster, including its particular SEQ ID NO:, to that map location.

Map locations are represented by ranges, or intervals, of human chromosomes. The map
position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's
p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between
chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in
humans, although this can vary widely due to hot and cold spots of recombination.) The cM
distances are based on genetic markers mapped by Généthon which provide boundaries for radiation
hybrid markers whose sequences were included in each of the clusters. Human genome maps and
other resources available to the public, such as the NCBI "GeneMap'99" World Wide Web site
(http://www.ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified
disease genes map within or in proximity to the intervals indicated above.
VII. Analysis of Polynucleotide Expression

Northern analysis is a laboratory technique used to detect the presence of a transcript of a
gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs
from a particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 7; Ausubel
(1995) supra, ch. 4 and 16.)

Analogous computer techniques applying BLAST were used to search for identical or related
molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is
much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the
computer search can be modified to determine whether any particular match is categorized as exact or
similar. The basis of the search is the product score, which is defined as:

          BLAST Score x Percent Identity
    --------------------------------------------
    5 x minimum {length(Seq. 1), length(Seq. 2)}

The product score takes into account both the degree of similarity between two sequences and the
length of the sequence match. The product score is a normalized value between 0 and 100, and is
calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the
product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is
calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair
(HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by
gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate
the product score. The product score represents a balance between fractional overlap and quality in a
BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the
entire length of the shorter of the two sequences being compared. A product score of 70 is produced
either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the
other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79%
identity and 100% overlap.
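
By way of illustration only, the product score arithmetic described above can be sketched as follows.
The function and parameter names are hypothetical, and the sketch assumes a single ungapped HSP
described by its counts of matching and mismatching bases.

```python
def blast_score(matches, mismatches):
    """Score an HSP as described in the text: +5 per match, -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(matches, mismatches, len_seq1, len_seq2):
    """BLAST score x percent identity / (5 x length of the shorter sequence)."""
    aligned = matches + mismatches
    if aligned == 0:
        return 0.0
    percent_identity = 100.0 * matches / aligned
    shorter = min(len_seq1, len_seq2)
    return blast_score(matches, mismatches) * percent_identity / (5 * shorter)


# 100% identity over the entire shorter sequence (100 bases).
full_match = product_score(100, 0, 100, 250)
# 100% identity over 70% of a 100-base sequence.
partial_overlap = product_score(70, 0, 100, 100)
# 88% identity over 100% of a 100-base sequence.
lower_identity = product_score(88, 12, 100, 100)
print(full_match, partial_overlap, round(lower_identity, 3))
```

Under these assumptions the first two cases give exactly 100 and 70, and the third gives about 69,
consistent with the rounded figure of 70 quoted above for 88% identity and 100% overlap.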

Alternatively, polynucleotide sequences encoding PP are analyzed with respect to the tissue
sources from which they were derived. For example, some full length sequences are assembled, at
least in part, with overlapping Incyte cDNA sequences (see Example III). Each cDNA sequence is
derived from a cDNA library constructed from a human tissue. Each human tissue is classified into
one of the following organ/tissue categories: cardiovascular system; connective tissue; digestive
system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male;
germ cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas;
respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract.
The number of libraries in each category is counted and divided by the total number of libraries
across all categories. Similarly, each human tissue is classified into one of the following
disease/condition categories: cancer, cell line, developmental, inflammation, neurological, trauma,
cardiovascular, pooled, and other, and the number of libraries in each category is counted and divided
by the total number of libraries across all categories. The resulting percentages reflect the tissue- and
disease-specific expression of cDNA encoding PP. cDNA sequences and cDNA library/tissue
information are found in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto CA).
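
By way of illustration only, the counting and dividing step described above can be sketched as
follows. The function name and the example library assignments are hypothetical and are not drawn
from the LIFESEQ GOLD database.

```python
from collections import Counter


def category_percentages(library_categories):
    """Return, per category, the percentage of libraries assigned to it."""
    counts = Counter(library_categories)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical example: the tissue category of each library in which
# a given cDNA sequence was found.
libraries = [
    "nervous system",
    "nervous system",
    "liver",
    "hemic and immune system",
]
percentages = category_percentages(libraries)
print(percentages)
```

The same function applies unchanged to the disease/condition categories, since only the labels differ.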
VIII. Extension of PP Encoding Polynucleotides

Full length polynucleotide sequences were also produced by extension of an appropriate
fragment of the full length molecule using oligonucleotide primers designed from this fragment. One
primer was synthesized to initiate 5' extension of the known fragment, and the other primer was
synthesized to initiate 3' extension of the known fragment. The initial primers were designed using
OLIGO 4.06 software (National Biosciences), or another appropriate program, to be about 22 to 30
nucleotides in length, to have a GC content of about 50% or more, and to anneal to the target
sequence at temperatures of about 68°C to about 72°C. Any stretch of nucleotides which would
result in hairpin structures and primer-primer dimerizations was avoided.
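
By way of illustration only, two of the stated design criteria (length and GC content) can be checked
as sketched below. This is not the OLIGO 4.06 procedure; the function names, the hard cutoffs
standing in for "about", and the example sequences are all hypothetical, and annealing temperature,
hairpins, and primer-primer dimers are not evaluated.

```python
def gc_percent(primer):
    """Percentage of G and C bases in a primer sequence."""
    if not primer:
        raise ValueError("primer sequence is empty")
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def meets_length_and_gc(primer, min_len=22, max_len=30, min_gc=50.0):
    """Check length of 22 to 30 nucleotides and GC content of 50% or more."""
    return min_len <= len(primer) <= max_len and gc_percent(primer) >= min_gc


# Hypothetical candidate sequences, for illustration only.
candidate_ok = "GCGCATATGCGCATATGCGCATGC"   # 24 nt, 14 G/C bases
candidate_at_rich = "ATATATATATATATATATATATAT"  # 24 nt, no G/C bases
candidate_short = "GCGCGCGC"                # 8 nt

print(meets_length_and_gc(candidate_ok))
print(meets_length_and_gc(candidate_at_rich))
print(meets_length_and_gc(candidate_short))
```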

Selected human cDNA libraries were used to extend the sequence. If more than one 
extension was necessary or desired, additional or nested sets of primers were designed. 

High fidelity amplification was obtained by PCR using methods well known in the art. PCR
was performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction
mix contained DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4,
and 2-mercaptoethanol, Taq DNA polymerase (Amersham Pharmacia Biotech), ELONGASE enzyme
(Life Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters for primer
pair PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C,
2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C. In the
alternative, the parameters for primer pair T7 and SK+ were as follows: Step 1: 94°C, 3 min; Step 2:
94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times;
Step 6: 68°C, 5 min; Step 7: storage at 4°C.
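
By way of illustration only, the first cycling program can be written down as data and its total
programmed hold time computed, as sketched below. The constant and function names are
hypothetical. The sketch assumes that "repeated 20 times" means 20 repeats after the first pass
through Steps 2 to 4 (21 passes in all), and it ignores ramp times between temperatures and the
final 4°C storage step.

```python
# (temperature in degrees C, hold time in seconds), as read from Steps 1 to 6.
INITIAL_DENATURATION = (94, 180)             # Step 1
CYCLE = [(94, 15), (60, 60), (68, 120)]      # Steps 2, 3, and 4
FINAL_EXTENSION = (68, 300)                  # Step 6


def total_hold_seconds(cycle, passes,
                       initial=INITIAL_DENATURATION,
                       final=FINAL_EXTENSION):
    """Sum the programmed hold times, excluding ramping and storage."""
    per_pass = sum(seconds for _temperature, seconds in cycle)
    return initial[1] + passes * per_pass + final[1]


# Assumption: one initial pass plus 20 repeats.
passes = 1 + 20
total = total_hold_seconds(CYCLE, passes)
print(total, "seconds, or", total / 60, "minutes")
```

If Step 5 is instead read as 20 passes in total, the same function can be called with `passes=20`.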

The concentration of DNA in each well was determined by dispensing 100 µl PICOGREEN
quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene OR) dissolved in 1X TE
and 0.5 µl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar,
Acton MA), allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II
(Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the
concentration of DNA. A 5 µl to 10 µl aliquot of the reaction mixture was analyzed by
electrophoresis on a 1% agarose gel to determine which reactions were successful in extending the
sequence.

The extended nucleotides were desalted and concentrated, transferred to 384-well plates,
digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and
sonicated or sheared prior to religation into pUC 18 vector (Amersham Pharmacia Biotech). For
shotgun sequencing, the digested nucleotides were separated on low concentration (0.6 to 0.8%)
agarose gels, fragments were excised, and agar digested with Agar ACE (Promega). Extended clones
were religated using T4 ligase (New England Biolabs, Beverly MA) into pUC 18 vector (Amersham
Pharmacia Biotech), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site
overhangs, and transfected into competent E. coli cells. Transformed cells were selected on
antibiotic-containing media, and individual colonies were picked and cultured overnight at 37°C in
384-well plates in LB/2x carb liquid media.

The cells were lysed, and DNA was amplified by PCR using Taq DNA polymerase
(Amersham Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following
parameters: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min;
Step 5: steps 2, 3, and 4 repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA was
quantified by PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA
recoveries were reamplified using the same conditions as described above. Samples were diluted
with 20% dimethylsulfoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing
primers and the DYENAMIC DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM
BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).

In like manner, full length polynucleotide sequences are verified using the above procedure or
are used to obtain 5' regulatory sequences using the above procedure along with oligonucleotides
designed for such extension, and an appropriate genomic library.

IX. Labeling and Use of Individual Hybridization Probes

Hybridization probes derived from SEQ ID NO:11-20 are employed to screen cDNAs,
genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base
pairs, is specifically described, essentially the same procedure is used with larger nucleotide
fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06
software (National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 µCi of
[γ-32P] adenosine triphosphate (Amersham Pharmacia Biotech), and T4 polynucleotide kinase
(DuPont NEN, Boston MA). The labeled oligonucleotides are substantially purified using a
SEPHADEX G-25 superfine size exclusion dextran bead column (Amersham Pharmacia Biotech).
An aliquot containing 10^7 counts per minute of the labeled probe is used in a typical membrane-based
hybridization analysis of human genomic DNA digested with one of the following endonucleases:
Ase I, Bgl II, Eco RI, Pst I, Xba I, or Pvu II (DuPont NEN).

The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to nylon
membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is carried out for 16
hours at 40°C. To remove nonspecific signals, blots are sequentially washed at room temperature
under conditions of up to, for example, 0.1 x saline sodium citrate and 0.5% sodium dodecyl sulfate.
Hybridization patterns are visualized using autoradiography or an alternative imaging means and
compared.

X. Microarrays 

The linkage or synthesis of array elements upon a microarray can be achieved utilizing
photolithography, piezoelectric printing (ink-jet printing; see, e.g., Baldeschweiler, supra),
mechanical microspotting technologies, and derivatives thereof. The substrate in each of the
aforementioned technologies should be uniform and solid with a non-porous surface (Schena (1999),
supra). Suggested substrates include silicon, silica, glass slides, glass chips, and silicon wafers.
Alternatively, a procedure analogous to a dot or slot blot may also be used to arrange and link
elements to the surface of a substrate using thermal, UV, chemical, or mechanical bonding
procedures. A typical array may be produced using available methods and machines well known to
those of ordinary skill in the art and may contain any appropriate number of elements. (See, e.g.,
Schena, M. et al. (1995) Science 270:467-470; Shalon, D. et al. (1996) Genome Res. 6:639-645;
Marshall, A. and J. Hodgson (1998) Nat. Biotechnol. 16:27-31.)

Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof may
comprise the elements of the microarray. Fragments or oligomers suitable for hybridization can be
selected using software well known in the art such as LASERGENE software (DNASTAR). The
array elements are hybridized with polynucleotides in a biological sample. The polynucleotides in the
biological sample are conjugated to a fluorescent label or other molecular tag for ease of detection.
After hybridization, nonhybridized nucleotides from the biological sample are removed, and a
fluorescence scanner is used to detect hybridization at each array element. Alternatively, laser
desorption and mass spectrometry may be used for detection of hybridization. The degree of
complementarity and the relative abundance of each polynucleotide which hybridizes to an element
on the microarray may be assessed. In one embodiment, microarray preparation and usage is
described in detail below.

Tissue or Cell Sample Preparation

Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and
poly(A)+ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)+ RNA sample is
reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/µl oligo-(dT) primer (21mer), 1X
first strand buffer, 0.03 units/µl RNase inhibitor, 500 µM dATP, 500 µM dGTP, 500 µM dTTP, 40
µM dCTP, 40 µM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech). The reverse
transcription reaction is performed in a 25 µl volume containing 200 ng poly(A)+ RNA with
GEMBRIGHT kits (Incyte). Specific control poly(A)+ RNAs are synthesized by in vitro transcription
from non-coding yeast genomic DNA. After incubation at 37°C for 2 hr, each reaction sample (one
with Cy3 and another with Cy5 labeling) is treated with 2.5 µl of 0.5M sodium hydroxide and
incubated for 20 minutes at 85°C to stop the reaction and degrade the RNA. Samples are purified
using two successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc.
(CLONTECH), Palo Alto CA) and after combining, both reaction samples are ethanol precipitated
using 1 µl of glycogen (1 mg/ml), 60 µl sodium acetate, and 300 µl of 100% ethanol. The sample is
then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and
resuspended in 14 µl 5X SSC/0.2% SDS.
Microarray Preparation 

Sequences of the present invention are used to generate array elements. Each array element 
is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification 
uses primers complementary to the vector sequences flanking the cDNA insert. Array elements are 
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 
µg. Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia 
Biotech). 
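
The yield figures above imply an average per-cycle amplification factor well below perfect doubling, as would be expected once a reaction reaches plateau. The following Python sketch is illustrative only and is not part of the described procedure. The function name is hypothetical, and the calculation assumes a 2 ng starting quantity and exactly 5 µg of product.

```python
def per_cycle_factor(start_ng: float, final_ng: float, cycles: int) -> float:
    """Return the average per-cycle amplification factor (2.0 = perfect doubling)."""
    if start_ng <= 0 or final_ng <= 0 or cycles <= 0:
        raise ValueError("quantities and cycle count must be positive")
    # final = start * factor**cycles, so factor is the cycles-th root of the fold change
    return (final_ng / start_ng) ** (1.0 / cycles)


# Assumed values: 2 ng of template, 5 ug (5000 ng) of product, thirty cycles.
fold_change = 5000.0 / 2.0
factor = per_cycle_factor(2.0, 5000.0, 30)
print(f"fold change: {fold_change:.0f}, average factor per cycle: {factor:.2f}")
```

The average lumps together the early exponential cycles and the later plateau cycles, so it should not be read as the efficiency of any single cycle.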

Purified array elements are immobilized on polymer-coated glass slides. Glass microscope 
slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water 
washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR 
Scientific Products Corporation (VWR), West Chester PA), washed extensively in distilled water, 
and coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 
110°C oven. 

Array elements are applied to the coated glass substrate using a procedure described in US 
Patent No. 5,807,522, incorporated herein by reference. 1 µl of the array element DNA, at an average 
concentration of 100 ng/µl, is loaded into the open capillary printing element by a high-speed robotic 
apparatus. The apparatus then deposits about 5 nl of array element sample per slide. 

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene). 
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water. 
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate 
buffered saline (PBS) (Tropix, Inc., Bedford MA) for 30 minutes at 60°C followed by washes in 
0.2% SDS and distilled water as before. 
Hybridization 
Hybridization reactions contain 9 µl of sample mixture consisting of 0.2 µg each of Cy3 and 
Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The sample 
mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered 
with a 1.8 cm² coverslip. The arrays are transferred to a waterproof chamber having a cavity just 
slightly larger than a microscope slide. The chamber is kept at 100% humidity internally by the 
addition of 140 µl of 5X SSC in a corner of the chamber. The chamber containing the arrays is 
incubated for about 6.5 hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash 
buffer (1X SSC, 0.1% SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X 
SSC), and dried. 

Detection 

Reporter-labeled hybridization complexes are detected with a microscope equipped with an 
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines 
at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is 
focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide 
containing the array is placed on a computer-controlled X-Y stage on the microscope and 
raster-scanned past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned with a 
resolution of 20 micrometers. 
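
The scan geometry fixes the size of the resulting image. As a rough illustration, and assuming one pixel per 20 micrometer step along each axis (an assumption, since the text gives only the resolution), the pixel count can be worked out as follows. The function name is hypothetical.

```python
def pixels_per_side(side_cm: float, resolution_um: float) -> int:
    """Number of scan positions along one edge of a square array."""
    if side_cm <= 0 or resolution_um <= 0:
        raise ValueError("dimensions must be positive")
    side_um = side_cm * 10_000  # 1 cm = 10,000 micrometers
    return round(side_um / resolution_um)


side = pixels_per_side(1.8, 20.0)
total_pixels = side * side
print(f"{side} x {side} positions, {total_pixels} pixels per scan")
```

Under these assumptions each scan of one fluorophore yields a 900 x 900 image.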

In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially. 
Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477, 
Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores. Appropriate 
filters positioned between the array and the photomultiplier tubes are used to filter the signals. The 
emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is 
typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source, 
although the apparatus is capable of recording the spectra from both fluorophores simultaneously. 

The sensitivity of the scans is typically calibrated using the signal intensity generated by a 
cDNA control species added to the sample mixture at a known concentration. A specific location on 
the array contains a complementary DNA sequence, allowing the intensity of the signal at that 
location to be correlated with a weight ratio of hybridizing species of 1:100,000. When two samples 
from different sources (e.g., representing test and control cells), each labeled with a different 
fluorophore, are hybridized to a single array for the purpose of identifying genes that are 
differentially expressed, the calibration is done by labeling samples of the calibrating cDNA with the 
two fluorophores and adding identical amounts of each to the hybridization mixture. 
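
One way to use such a calibrating cDNA is to scale each channel by the signal of its calibration spot before comparing the two samples. The sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, the Cy5 channel is arbitrarily treated as the test sample and Cy3 as the control, and the intensities are invented.

```python
import math


def balanced_log2_ratio(cy3_signal: float, cy5_signal: float,
                        cy3_calibrator: float, cy5_calibrator: float) -> float:
    """log2(test/control) after scaling each channel by its calibration spot."""
    values = (cy3_signal, cy5_signal, cy3_calibrator, cy5_calibrator)
    if min(values) <= 0:
        raise ValueError("all intensities must be positive")
    # Identical amounts of calibrator were added in both labels, so dividing by
    # the calibrator signal removes channel-specific gain differences.
    return math.log2((cy5_signal / cy5_calibrator) / (cy3_signal / cy3_calibrator))


# Invented intensities: the Cy5 channel reads twice as bright overall
# (calibrator 1000 vs 500), yet the element is still twofold up after balancing.
ratio = balanced_log2_ratio(cy3_signal=1000.0, cy5_signal=4000.0,
                            cy3_calibrator=500.0, cy5_calibrator=1000.0)
print(f"balanced log2 ratio: {ratio:.2f}")
```

A value of 0 would indicate equal abundance in the two samples; which fluorophore labels the test sample is an experimental choice and is not specified here.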

The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital 
(A/D) conversion board (Analog Devices, Inc., Norwood MA) installed in an IBM-compatible PC 
computer. The digitized data are displayed as an image where the signal intensity is mapped using a 
linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high 
signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and 
measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping 
emission spectra) between the fluorophores using each fluorophore's emission spectrum. 
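
The display mapping and the crosstalk correction can both be sketched in a few lines. This is an illustrative Python sketch, not the software actually used: the function names are hypothetical, the crosstalk model is assumed to be a simple two-channel linear mixture, and the leak fractions are invented values that would in practice be derived from the emission spectra and filter characteristics.

```python
def pseudocolor_index(adc_value: int, n_colors: int = 20, adc_bits: int = 12) -> int:
    """Map an A/D count linearly to a color index, 0 (blue) to n_colors - 1 (red)."""
    full_scale = 2 ** adc_bits  # 4096 levels for a 12-bit converter
    if not 0 <= adc_value < full_scale:
        raise ValueError("A/D value out of range")
    return adc_value * n_colors // full_scale


def correct_crosstalk(measured_cy3: float, measured_cy5: float,
                      leak_cy3_into_cy5: float, leak_cy5_into_cy3: float):
    """Invert the assumed mixing model to recover the true channel signals.

    measured_cy3 = true_cy3 + leak_cy5_into_cy3 * true_cy5
    measured_cy5 = leak_cy3_into_cy5 * true_cy3 + true_cy5
    """
    det = 1.0 - leak_cy3_into_cy5 * leak_cy5_into_cy3
    if det == 0:
        raise ValueError("mixing model is singular")
    true_cy3 = (measured_cy3 - leak_cy5_into_cy3 * measured_cy5) / det
    true_cy5 = (measured_cy5 - leak_cy3_into_cy5 * measured_cy3) / det
    return true_cy3, true_cy5


# Invented example: 10% of Cy3 emission reaches the Cy5 detector, 2% the reverse.
cy3, cy5 = correct_crosstalk(1010.0, 600.0,
                             leak_cy3_into_cy5=0.10, leak_cy5_into_cy3=0.02)
print(pseudocolor_index(0), pseudocolor_index(2048), pseudocolor_index(4095))
print(round(cy3, 3), round(cy5, 3))
```

Integer arithmetic is used for the color index so that the top A/D count falls in the last of the twenty bins rather than overflowing into a twenty-first.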

A grid is superimposed over the fluorescence signal image such that the signal from each 
spot is centered in each element of the grid. The fluorescence signal within each element is then 
integrated to obtain a numerical value corresponding to the average intensity of the signal. The 
software used for signal analysis is the GEMTOOLS gene expression analysis program (Incyte). 
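
The grid integration step can be illustrated with a short Python sketch. It does not reproduce the GEMTOOLS program, whose internals are not described here. The function name is hypothetical, and the sketch assumes square grid elements of a fixed pixel size that tile the image exactly.

```python
def integrate_grid(image, cell_px: int):
    """Return the average pixel intensity inside each square grid element."""
    if cell_px <= 0:
        raise ValueError("cell size must be positive")
    n_rows = len(image) // cell_px
    n_cols = len(image[0]) // cell_px
    averages = []
    for r in range(n_rows):
        row_values = []
        for c in range(n_cols):
            # Sum every pixel that falls inside grid element (r, c).
            total = sum(image[y][x]
                        for y in range(r * cell_px, (r + 1) * cell_px)
                        for x in range(c * cell_px, (c + 1) * cell_px))
            row_values.append(total / (cell_px * cell_px))
        averages.append(row_values)
    return averages


# Invented 4 x 4 image covering a 2 x 2 grid of array elements.
demo_image = [
    [1, 1, 5, 5],
    [1, 1, 5, 5],
    [0, 4, 2, 2],
    [4, 0, 2, 2],
]
spot_means = integrate_grid(demo_image, cell_px=2)
print(spot_means)
```

A production analysis would typically also locate spot centers and subtract local background, neither of which is attempted in this sketch.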

XI. Complementary Polynucleotides 

Sequences complementary to the PP-encoding sequences, or any parts thereof, are used to 
detect, decrease, or inhibit expression of naturally occurring PP. Although use of oligonucleotides 
comprising from about 15 to 30 base pairs is described, essentially the same procedure is used with 
smaller or with larger sequence fragments. Appropriate oligonucleotides are designed using OLIGO 
4.06 software (National Biosciences) and the coding sequence of PP. To inhibit transcription, a 
complementary oligonucleotide is designed from the most unique 5' sequence and used to prevent 
promoter binding to the coding sequence. To inhibit translation, a complementary oligonucleotide is 
designed to prevent ribosomal binding to the PP-encoding transcript. 
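
The core operation in designing such an oligonucleotide is taking the reverse complement of a chosen window of the coding sequence. The Python sketch below illustrates only that step and is not a substitute for the OLIGO software, which also evaluates properties such as melting temperature. The function names and the example sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def complementary_oligo(coding_seq: str, start: int, length: int = 20) -> str:
    """Return an oligonucleotide complementary to coding_seq[start:start+length]."""
    if not 15 <= length <= 30:
        raise ValueError("length is expected to be about 15 to 30 bases")
    target = coding_seq[start:start + length]
    if len(target) < length:
        raise ValueError("window extends past the end of the sequence")
    return reverse_complement(target)


# Made-up sequence used purely for illustration; it is not a PP sequence.
example_coding = "ATGGCCAAGCTTGGATCCGA"
oligo = complementary_oligo(example_coding, start=0, length=15)
print(oligo)
```

Choosing which window is "most unique" requires comparison against other sequences and is outside the scope of this sketch.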
XII. Expression of PP 

Expression and purification of PP is achieved using bacterial or virus-based expression 
systems. For expression of PP in bacteria, cDNA is subcloned into an appropriate vector containing 
an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA transcription. 
Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid promoter and the 
T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory element. 
Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3). Antibiotic 
resistant bacteria express PP upon induction with isopropyl beta-D-thiogalactopyranoside (IPTG). 
Expression of PP in eukaryotic cells is achieved by infecting insect or mammalian cell lines with 
recombinant Autographa californica nuclear polyhedrosis virus (AcMNPV), commonly known as 
baculovirus. The nonessential polyhedrin gene of baculovirus is replaced with cDNA encoding PP by 
either homologous recombination or bacterial-mediated transposition involving transfer plasmid 
intermediates. Viral infectivity is maintained and the strong polyhedrin promoter drives high levels 
of cDNA transcription. Recombinant baculovirus is used to infect Spodoptera frugiperda (Sf9) insect 
cells in most cases, or human hepatocytes, in some cases. Infection of the latter requires additional 
genetic modifications to baculovirus. (See Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. USA 
91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945.) 

In most expression systems, PP is synthesized as a fusion protein with, e.g., glutathione 
S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step, 
affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 
26-kilodalton enzyme from Schistosoma japonicum, enables the purification of fusion proteins on 
immobilized glutathione under conditions that maintain protein activity and antigenicity (Amersham 
Pharmacia Biotech). Following purification, the GST moiety can be proteolytically cleaved from PP 
at specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification 
using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak). 
6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins 
(QIAGEN). Methods for protein expression and purification are discussed in Ausubel (1995, supra, 
ch. 10 and 16). Purified PP obtained by these methods can be used directly in the assays shown in 
Examples XVI, XVII, XVIII, and XIX where applicable. 
XIII. Functional Assays 

PP function is assessed by expressing the sequences encoding PP at physiologically elevated 
levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression vector 
containing a strong promoter that drives high levels of cDNA expression. Vectors of choice include 
PCMV SPORT (Life Technologies) and PCR3.1 (Invitrogen, Carlsbad CA), both of which contain 
the cytomegalovirus promoter. 5-10 µg of recombinant vector are transiently transfected into a 
human cell line, for example, an endothelial or hematopoietic cell line, using either liposome 
formulations or electroporation. 1-2 µg of an additional plasmid containing sequences encoding a 
marker protein are co-transfected. Expression of a marker protein provides a means to distinguish 
transfected cells from nontransfected cells and is a reliable predictor of cDNA expression from the 
recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; 
Clontech), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated, laser 
optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to evaluate 
the apoptotic state of the cells and other cellular properties. FCM detects and quantifies the uptake of 
fluorescent molecules that diagnose events preceding or coincident with cell death. These events 
include changes in nuclear DNA content as measured by staining of DNA with propidium iodide; 
changes in cell size and granularity as measured by forward light scatter and 90 degree side light 
scatter; down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine uptake; 
alterations in expression of cell surface and intracellular proteins as measured by reactivity with 
specific antibodies; and alterations in plasma membrane composition as measured by the binding of 
fluorescein-conjugated Annexin V protein to the cell surface. Methods in flow cytometry are 
discussed in Ormerod, M.G. (1994) Flow Cytometry, Oxford, New York NY. 

The influence of PP on gene expression can be assessed using highly purified populations of 
cells transfected with sequences encoding PP and either CD64 or CD64-GFP. CD64 and CD64-GFP 
are expressed on the surface of transfected cells and bind to conserved regions of human 
immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using 
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake Success 
NY). mRNA can be purified from the cells using methods well known by those of skill in the art. 
Expression of mRNA encoding PP and other genes of interest can be analyzed by northern analysis or 
microarray techniques. 

XIV. Production of PP Specific Antibodies 

PP substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g., 
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to 
immunize rabbits and to produce antibodies using standard protocols. 

Alternatively, the PP amino acid sequence is analyzed using LASERGENE software 
(DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is 
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for 
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well 
described in the art. (See, e.g., Ausubel, 1995, supra, ch. 11.) 

Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A 
peptide synthesizer (Applied Biosystems) using FMOC chemistry and coupled to KLH 
(Sigma-Aldrich, St. Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to 
increase immunogenicity. (See, e.g., Ausubel, 1995, supra.) Rabbits are immunized with the 
oligopeptide-KLH complex in complete Freund's adjuvant. Resulting antisera are tested for 
antipeptide and anti-PP activity by, for example, binding the peptide or PP to a substrate, blocking 
with 1% BSA, reacting with rabbit antisera, washing, and reacting with radio-iodinated goat 
anti-rabbit IgG. 

XV. Purification of Naturally Occurring PP Using Specific Antibodies 

Naturally occurring or recombinant PP is substantially purified by immunoaffinity 
chromatography using antibodies specific for PP. An immunoaffinity column is constructed by 
covalently coupling anti-PP antibody to an activated chromatographic resin, such as CNBr-activated 
SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is blocked and washed 
according to the manufacturer's instructions. 

Media containing PP are passed over the immunoaffinity column, and the column is washed 
under conditions that allow the preferential absorbance of PP (e.g., high ionic strength buffers in the 
presence of detergent). The column is eluted under conditions that disrupt antibody/PP binding (e.g., 
a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such as urea or thiocyanate ion), and 
PP is collected. 

XVI. Identification of Molecules Which Interact with PP 

PP, or biologically active fragments thereof, are labeled with 125I Bolton-Hunter reagent. 
(See, e.g., Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539.) Candidate molecules 
previously arrayed in the wells of a multi-well plate are incubated with the labeled PP, washed, and 
any wells with labeled PP complex are assayed. Data obtained using different concentrations of PP 
are used to calculate values for the number, affinity, and association of PP with the candidate 
molecules. 
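
The text does not say how the number and affinity values are calculated. One common approach for data of this kind is a Scatchard analysis, shown here only as an illustrative sketch under the assumption of a single class of independent binding sites. The function name and the synthetic data are hypothetical.

```python
def scatchard_fit(free, bound):
    """Estimate (Kd, Bmax) from a least-squares line through bound/free vs bound.

    For one class of sites, bound/free = Bmax/Kd - bound/Kd, so the slope is
    -1/Kd and the x-intercept is Bmax.
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = -intercept / slope
    return kd, bmax


# Synthetic, noise-free data generated with Kd = 5 and Bmax = 100 (arbitrary units).
free_conc = [1.0, 2.0, 5.0, 10.0, 20.0]
bound_conc = [100.0 * f / (5.0 + f) for f in free_conc]
kd_est, bmax_est = scatchard_fit(free_conc, bound_conc)
print(round(kd_est, 3), round(bmax_est, 3))
```

With real, noisy data a direct nonlinear fit of the binding isotherm is generally preferred, because the Scatchard transformation distorts the error structure.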

Alternatively, molecules interacting with PP are analyzed using the yeast two-hybrid system 
as described in Fields, S. and O. Song (1989) Nature 340:245-246, or using commercially available 
kits based on the two-hybrid system, such as the MATCHMAKER system (Clontech). 

PP may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT) which 
employs the yeast two-hybrid system in a high-throughput manner to determine all interactions 
between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. 
Patent No. 6,057,101). 

XVII. Demonstration of PP Activity 

PP activity is measured by the hydrolysis of para-nitrophenyl phosphate (PNPP). PP is 
incubated together with PNPP in HEPES buffer pH 7.5, in the presence of 0.1% β-mercaptoethanol at 
37°C for 60 min. The reaction is stopped by the addition of 6 ml of 10 N NaOH (Diamond, R.H. et 
al. (1994) Mol. Cell. Biol. 14:3752-62). Alternatively, acid phosphatase activity of PP is 
demonstrated by incubating PP-containing extract with 100 µl of 10 mM PNPP in 0.1 M sodium 
citrate, pH 4.5, and 50 µl of 40 mM NaCl at 37°C for 20 min. The reaction is stopped by the addition 
of 0.5 ml of 0.4 M glycine/NaOH, pH 10.4 (Saftig, P. et al. (1997) J. Biol. Chem. 272:18628-18635). 
The increase in light absorbance at 410 nm resulting from the hydrolysis of PNPP is measured using a 
spectrophotometer. The increase in light absorbance is proportional to the activity of PP in the assay. 
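
The proportionality between absorbance and activity follows from the Beer-Lambert law. The Python sketch below converts an absorbance increase into a rate of product formation. It is illustrative only: the function name is hypothetical, the molar extinction coefficient of about 18,000 per M per cm for p-nitrophenolate under alkaline conditions is an assumed literature-style value that should be verified for the actual wavelength and buffer, and the example readings are invented.

```python
def pnpp_rate_nmol_per_min(delta_absorbance: float, final_volume_ml: float,
                           minutes: float, epsilon_per_m_cm: float = 18000.0,
                           path_cm: float = 1.0) -> float:
    """Convert an absorbance increase into nmol of p-nitrophenol formed per minute."""
    if final_volume_ml <= 0 or minutes <= 0:
        raise ValueError("volume and time must be positive")
    # Beer-Lambert: A = epsilon * c * l, so c = A / (epsilon * l) in mol/L.
    conc_molar = delta_absorbance / (epsilon_per_m_cm * path_cm)
    nmol_product = conc_molar * (final_volume_ml / 1000.0) * 1e9
    return nmol_product / minutes


# Invented reading: absorbance rises by 0.36 in a 1.0 ml final volume over 20 min.
rate = pnpp_rate_nmol_per_min(0.36, final_volume_ml=1.0, minutes=20.0)
print(f"{rate:.2f} nmol/min")
```

The volume used should be the final volume in the cuvette after the stop solution is added, since that is the volume in which the absorbance is read.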

In the alternative, PP activity is determined by measuring the amount of phosphate removed 
from a phosphorylated protein substrate. Reactions are performed with 2 or 4 nM enzyme in a final 
volume of 30 µl containing 60 mM Tris, pH 7.6, 1 mM EDTA, 1 mM EGTA, 0.1% β-mercaptoethanol 
and 10 µM substrate, 32P-labeled on serine/threonine or tyrosine, as appropriate. Reactions are 
initiated with substrate and incubated at 30°C for 10-15 min. Reactions are quenched with 450 µl of 
4% (w/v) activated charcoal in 0.6 M HCl, 90 mM Na4P2O7, and 2 mM NaH2PO4, then centrifuged at 
12,000 x g for 5 min. Acid-soluble 32Pi is quantified by liquid scintillation counting (Sinclair, C. et 
al. (1999) J. Biol. Chem. 274:23666-23672). 
XVIII. Identification of PP Inhibitors 

Compounds to be tested are arrayed in the wells of a 384-well plate in varying concentrations 
along with an appropriate buffer and substrate, as described in the assays in Example XVII. PP 
activity is measured for each well and the ability of each compound to inhibit PP activity can be 
determined, as well as the dose-response kinetics. This assay could also be used to identify molecules 
which enhance PP activity. 
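
A simple summary of a dose-response series is the concentration giving half of the uninhibited activity. The sketch below estimates it by linear interpolation on a log-concentration axis. This is one of several possible methods and is not prescribed by the text. The function name and the plate readings are hypothetical.

```python
import math


def ic50_by_interpolation(concentrations, activities, control_activity):
    """Estimate the half-maximal inhibitory concentration, or None if not bracketed.

    Concentrations must be positive; activities are in the same units as the
    uninhibited control.
    """
    pairs = sorted(zip(concentrations, activities))
    half = control_activity / 2.0
    for (c_lo, a_lo), (c_hi, a_hi) in zip(pairs, pairs[1:]):
        if a_lo >= half >= a_hi and a_lo != a_hi:
            # Interpolate linearly in log10(concentration) between the two wells.
            fraction = (a_lo - half) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Invented readings: activity as a percentage of an uninhibited control well.
doses_um = [0.1, 1.0, 10.0, 100.0]
percent_activity = [95.0, 75.0, 25.0, 5.0]
ic50 = ic50_by_interpolation(doses_um, percent_activity, control_activity=100.0)
print(f"estimated IC50: {ic50:.2f} uM")
```

A compound that never reduces activity to half of the control within the tested range returns None. Activators would need the comparison reversed.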
XIX. Identification of PP Substrates 

A PP "substrate-trapping" assay takes advantage of the increased substrate affinity that may 
be conferred by certain mutations in the PTP signature sequence. PP bearing these mutations form a 
stable complex with their substrate; this complex may be isolated biochemically. Site-directed 
mutagenesis of invariant residues in the PTP signature sequence in a clone encoding the catalytic 
domain of PP is performed using a method standard in the art or a commercial kit, such as the 
MUTA-GENE kit from BIO-RAD. For expression of PP mutants in Escherichia coli, DNA fragments 
containing the mutation are exchanged with the corresponding wild-type sequence in an expression 
vector bearing the sequence encoding PP or a glutathione S-transferase (GST)-PP fusion protein. PP 
mutants are expressed in E. coli and purified by chromatography. 

The expression vector is transfected into COS1 or 293 cells via calcium phosphate-mediated 
transfection with 20 µg of CsCl-purified DNA per 10-cm dish of cells or 8 µg per 6-cm dish. 
Forty-eight hours after transfection, cells are stimulated with 100 ng/ml epidermal growth factor to increase 
tyrosine phosphorylation in cells, as the tyrosine kinase EGFR is abundant in COS cells. Cells are 
lysed in 50 mM Tris-HCl, pH 7.5/5 mM EDTA/150 mM NaCl/1% Triton X-100/5 mM iodoacetic 
acid/10 mM sodium phosphate/10 mM NaF/5 µg/ml leupeptin/5 µg/ml aprotinin/1 mM benzamidine 
(1 ml per 10-cm dish, 0.5 ml per 6-cm dish). PP is immunoprecipitated from lysates with an 
appropriate antibody. GST-PP fusion proteins are precipitated with glutathione-Sepharose, 4 µg of 
mAb or 10 µl of beads respectively per mg of cell lysate. Complexes can be visualized by PAGE or 
further purified to identify substrate molecules (Flint, A.J. et al. (1997) Proc. Natl. Acad. Sci. USA 
94:1680-1685). 

Various modifications and variations of the described methods and systems of the invention 
will be apparent to those skilled in the art without departing from the scope and spirit of the 
invention. Although the invention has been described in connection with certain embodiments, it 
should be understood that the invention as claimed should not be unduly limited to such specific 
embodiments. Indeed, various modifications of the described modes for carrying out the invention 
which are obvious to those skilled in molecular biology or related fields are intended to be within the 
scope of the following claims. 

What is claimed is: 

1. An isolated polypeptide selected from the group consisting of: 

a) a polypeptide comprising an amino acid sequence selected from the group consisting of 
SEQ ID NO:1-10, 

b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical 
to an amino acid sequence selected from the group consisting of SEQ ID NO:1-10, 

c) a biologically active fragment of a polypeptide having an amino acid sequence selected 
from the group consisting of SEQ ID NO:1-10, and 

d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from 
the group consisting of SEQ ID NO:1-10. 

2. An isolated polypeptide of claim 1 selected from the group consisting of SEQ ID NO:1-10. 

3. An isolated polynucleotide encoding a polypeptide of claim 1. 

4. An isolated polynucleotide encoding a polypeptide of claim 2. 

5. An isolated polynucleotide of claim 4 selected from the group consisting of SEQ ID 
NO:11-20. 

6. A recombinant polynucleotide comprising a promoter sequence operably linked to a 
polynucleotide of claim 3. 

7. A cell transformed with a recombinant polynucleotide of claim 6. 

8. A transgenic organism comprising a recombinant polynucleotide of claim 6. 

9. A method of producing a polypeptide of claim 1, the method comprising: 

a) culturing a cell under conditions suitable for expression of the polypeptide, wherein said 
cell is transformed with a recombinant polynucleotide, and said recombinant polynucleotide 
comprises a promoter sequence operably linked to a polynucleotide encoding the polypeptide of 
claim 1, and 

b) recovering the polypeptide so expressed. 

10. An isolated antibody which specifically binds to a polypeptide of claim 1. 

11. An isolated polynucleotide selected from the group consisting of: 

a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting 
of SEQ ID NO:11-20, 

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% 
identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:11-20, 

c) a polynucleotide complementary to a polynucleotide of a), 

d) a polynucleotide complementary to a polynucleotide of b), and 

e) an RNA equivalent of a)-d). 

12. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a 
polynucleotide of claim 11. 

13. A method of detecting a target polynucleotide in a sample, said target polynucleotide 
having a sequence of a polynucleotide of claim 11, the method comprising: 

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides 
comprising a sequence complementary to said target polynucleotide in the sample, and which probe 
specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization 
complex is formed between said probe and said target polynucleotide, or fragments thereof, and 

b) detecting the presence or absence of said hybridization complex, and, optionally, if 
present, the amount thereof. 

14. A method of claim 13, wherein the probe comprises at least 60 contiguous nucleotides. 

15. A method of detecting a target polynucleotide in a sample, said target polynucleotide 
having a sequence of a polynucleotide of claim 11, the method comprising: 

a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction 
amplification, and 

b) detecting the presence or absence of said amplified target polynucleotide or fragment 
thereof, and, optionally, if present, the amount thereof. 

16. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable 
excipient. 

17. A composition of claim 16, wherein the polypeptide has an amino acid sequence selected 
from the group consisting of SEQ ID NO:1-10. 

18. A method for treating a disease or condition associated with decreased expression of 
functional PP, comprising administering to a patient in need of such treatment the composition of 
claim 16. 

19. A method of screening a compound for effectiveness as an agonist of a polypeptide of 
claim 1, the method comprising: 

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and 

b) detecting agonist activity in the sample. 

20. A composition comprising an agonist compound identified by a method of claim 19 and 
a pharmaceutically acceptable excipient. 

21. A method for treating a disease or condition associated with decreased expression of 
functional PP, comprising administering to a patient in need of such treatment a composition of claim 
20. 

22. A method of screening a compound for effectiveness as an antagonist of a polypeptide of 
claim 1, the method comprising: 

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and 

b) detecting antagonist activity in the sample. 

23. A composition comprising an antagonist compound identified by a method of claim 22 
and a pharmaceutically acceptable excipient. 

24. A method for treating a disease or condition associated with overexpression of functional 
PP, comprising administering to a patient in need of such treatment a composition of claim 23. 

WO 02/10363 
PCT/US01/23716 

25. A method of screening for a compound that specifically binds to the polypeptide of claim 
1, the method comprising: 

a) combining the polypeptide of claim 1 with at least one test compound under suitable 
conditions, and 

b) detecting binding of the polypeptide of claim 1 to the test compound, thereby identifying a 
compound that specifically binds to the polypeptide of claim 1. 

26. A method of screening for a compound that modulates the activity of the polypeptide of 
claim 1, the method comprising: 

a) combining the polypeptide of claim 1 with at least one test compound under conditions 
permissive for the activity of the polypeptide of claim 1, 

b) assessing the activity of the polypeptide of claim 1 in the presence of the test compound, 
and 

c) comparing the activity of the polypeptide of claim 1 in the presence of the test compound 
with the activity of the polypeptide of claim 1 in the absence of the test compound, wherein a change 
in the activity of the polypeptide of claim 1 in the presence of the test compound is indicative of a 
compound that modulates the activity of the polypeptide of claim 1. 

27. A method of screening a compound for effectiveness in altering expression of a target 
polynucleotide, wherein said target polynucleotide comprises a sequence of claim 5, the method 
comprising: 

a) exposing a sample comprising the target polynucleotide to a compound, under conditions 
suitable for the expression of the target polynucleotide, 

b) detecting altered expression of the target polynucleotide, and 

c) comparing the expression of the target polynucleotide in the presence of varying amounts 
of the compound and in the absence of the compound. 

28. A method of assessing toxicity of a test compound, the method comprising: 

a) treating a biological sample containing nucleic acids with the test compound, 

b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at 
least 20 contiguous nucleotides of a polynucleotide of claim 11 under conditions whereby a specific 
hybridization complex is formed between said probe and a target polynucleotide in the biological 
sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide of claim 
11 or fragment thereof, 

c) quantifying the amount of hybridization complex, and 

d) comparing the amount of hybridization complex in the treated biological sample with the 
amount of hybridization complex in an untreated biological sample, wherein a difference in the 
amount of hybridization complex in the treated biological sample is indicative of toxicity of the test 
compound. 

29. A diagnostic test for a condition or disease associated with the expression of PP in a 
biological sample, the method comprising: 

a) combining the biological sample with an antibody of claim 10, under conditions suitable 
for the antibody to bind the polypeptide and form an antibody:polypeptide complex, and 

b) detecting the complex, wherein the presence of the complex correlates with the presence 
of the polypeptide in the biological sample. 

30. The antibody of claim 10, wherein the antibody is: 

a) a chimeric antibody, 

b) a single chain antibody, 

c) a Fab fragment, 

d) a F(ab')2 fragment, or 

e) a humanized antibody. 

31. A composition comprising an antibody of claim 10 and an acceptable excipient. 

32. A method of diagnosing a condition or disease associated with the expression of PP in a 
subject, comprising administering to said subject an effective amount of the composition of claim 31. 

33. A composition of claim 31, wherein the antibody is labeled. 

34. A method of diagnosing a condition or disease associated with the expression of PP in a 
subject, comprising administering to said subject an effective amount of the composition of claim 33. 

35. A method of preparing a polyclonal antibody with the specificity of the antibody of claim 
10, the method comprising: 

a) immunizing an animal with a polypeptide having an amino acid sequence selected from 
the group consisting of SEQ ID NO:1-10, or an immunogenic fragment thereof, under conditions to 
elicit an antibody response, 

b) isolating antibodies from said animal, and 

c) screening the isolated antibodies with the polypeptide, thereby identifying a polyclonal 
antibody which binds specifically to a polypeptide having an amino acid sequence selected from the 
group consisting of SEQ ID NO:1-10. 

36. An antibody produced by a method of claim 35. 

37. A composition comprising the antibody of claim 36 and a suitable carrier. 

38. A method of making a monoclonal antibody with the specificity of the antibody of claim 
10, the method comprising: 

a) immunizing an animal with a polypeptide having an amino acid sequence selected from 
the group consisting of SEQ ID NO:1-10, or an immunogenic fragment thereof, under conditions to 
elicit an antibody response, 

b) isolating antibody producing cells from the animal, 

c) fusing the antibody producing cells with immortalized cells to form monoclonal 
antibody-producing hybridoma cells, 

d) culturing the hybridoma cells, and 

e) isolating from the culture monoclonal antibody which binds specifically to a polypeptide 
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-10. 

39. A monoclonal antibody produced by a method of claim 38.

40. A composition comprising the antibody of claim 39 and a suitable carrier.

41. The antibody of claim 10, wherein the antibody is produced by screening a Fab
expression library.

42. The antibody of claim 10, wherein the antibody is produced by screening a recombinant
immunoglobulin library.

43. A method of detecting a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-10 in a sample, the method comprising:

a) incubating the antibody of claim 10 with a sample under conditions to allow specific
binding of the antibody and the polypeptide, and

b) detecting specific binding, wherein specific binding indicates the presence of a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-10 in
the sample.

44. A method of purifying a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-10 from a sample, the method comprising:

a) incubating the antibody of claim 10 with a sample under conditions to allow specific
binding of the antibody and the polypeptide, and

b) separating the antibody from the sample and obtaining the purified polypeptide having an
amino acid sequence selected from the group consisting of SEQ ID NO:1-10.

45. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:1.

46. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:2.

47. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:3.

48. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:4.

49. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:5.

50. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:6.

51. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:7.

52. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:8.

53. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:9.

54. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:10.

55. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID NO:11.

56. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID NO:12.

57. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID NO:13.

58. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID NO:14.

59. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID NO:15.

60. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID NO:16.

61. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID NO:17.

62. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID NO:18.

63. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID NO:19.

64. A polynucleotide of claim 11, comprising the polynucleotide sequence of SEQ ID NO:20.

<210> 1

<211> 545 

<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 1905692CD1 



<400> 1

Met Asn Glu Ser Pro Asp Pro Thr Asp Leu Ala Gly Val Ile Ile
1 5 10 15

Glu Leu Gly Pro Asn Asp Ser Pro Gln Thr Ser Glu Phe Lys Gly
20 25 30

Ala Thr Glu Glu Ala Pro Ala Lys Glu Ser Pro His Thr Ser Glu
35 40 45

Phe Lys Gly Ala Ala Arg Val Ser Pro Ile Ser Glu Ser Val Leu
50 55 60

Ala Arg Leu Ser Lys Phe Glu Asp Glu Asp Ala Glu Asn Val Ala
65 70 75

Ser Tyr Asp Ser Lys Ile Lys Lys Ile Val His Ser Ile Val Ser
80 85 90

Ser Phe Ala Phe Gly Leu Phe Gly Val Phe Leu Val Leu Leu Asp
95 100 105

Val Thr Leu Ile Leu Ala Asp Leu Ile Phe Thr Asp Ser Lys Leu
110 115 120

Tyr Ile Pro Leu Glu Tyr Arg Ser Ile Ser Leu Ala Ile Ala Leu
125 130 135

Phe Phe Leu Met Asp Val Leu Leu Arg Val Phe Val Glu Arg Arg
140 145 150

Gln Gln Tyr Phe Ser Asp Leu Phe Asn Ile Leu Asp Thr Ala Ile
155 160 165

Ile Val Ile Leu Leu Leu Val Asp Val Val Tyr Ile Phe Phe Asp
170 175 180

Ile Lys Leu Leu Arg Asn Ile Pro Arg Trp Thr His Leu Leu Arg
185 190 195

Leu Leu Arg Leu Ile Ile Leu Leu Arg Ile Phe His Leu Phe His
200 205 210

Gln Lys Arg Gln Leu Glu Lys Leu Ile Arg Arg Arg Val Ser Glu
215 220 225

Asn Lys Arg Arg Tyr Thr Arg Asp Gly Phe Asp Leu Asp Leu Thr
230 235 240

Tyr Val Thr Glu Arg Ile Ile Ala Met Ser Phe Pro Ser Ser Gly
245 250 255

Arg Gln Ser Phe Tyr Arg Asn Pro Ile Lys Val Ile Pro Tyr Arg
260 265 270

Asp Met Thr Tyr Ile Leu Phe Ile Leu Gly Glu Arg Ala Tyr Asp
275 280 285

Pro Lys His Phe His Asn Arg Val Val Arg Ile Met Ile Asp Asp
290 295 300

His Asn Val Pro Thr Leu His Gln Met Val Val Phe Thr Lys Glu
305 310 315

Val Asn Glu Trp Met Ala Gln Asp Leu Glu Asn Ile Val Ala Ile
320 325 330

His Cys Lys Gly Gly Thr Asp Arg Thr Gly Thr Met Val Cys Ala
335 340 345

Phe Leu Ile Ala Ser Glu Ile Cys Ser Thr Ala Lys Glu Ser Leu
350 355 360

Tyr Tyr Phe Gly Glu Arg Arg Thr Asp Lys Thr His Ser Glu Lys
365 370 375

Phe Gln Gly Val Lys Thr Pro Ser Gln Lys Arg Tyr Val Ala Tyr
380 385 390

Phe Ala Gln Val Lys His Leu Tyr Asn Trp Asn Leu Pro Pro Arg
395 400 405

Arg Ile Leu Phe Ile Lys His Phe Ile Ile Tyr Ser Ile Pro Arg
410 415 420

Tyr Val Arg Asp Leu Lys Ile Gln Ile Glu Met Glu Lys Lys Val
425 430 435

Val Phe Ser Thr Ile Ser Leu Gly Lys Cys Ser Val Leu Asp Asn
440 445 450

Ile Thr Thr Asp Lys Ile Leu Ile Asp Val Phe Asp Gly Pro Pro
455 460 465

Leu Tyr Asp Asp Val Lys Val Gln Phe Phe Ser Ser Asn Leu Pro
470 475 480

Thr Tyr Tyr Asp Asn Cys Ser Phe Tyr Phe Trp Leu His Thr Ser
485 490 495

Phe Ile Glu Asn Asn Arg Leu Tyr Leu Pro Lys Asn Glu Leu Asp
500 505 510

Asn Leu His Lys Gln Lys Ala Arg Arg Ile Tyr Pro Ser Asp Phe
515 520 525

Ala Val Glu Ile Leu Phe Gly Glu Lys Met Thr Ser Ser Asp Val
530 535 540

Val Ala Gly Ser Asp
545

<210> 2
<211> 360
<212> PRT
<213> Homo sapiens

<220>
<221> misc_feature
<223> Incyte ID No: 7476908CD1

<400> 2

Met Ile Glu Asp Thr Met Thr Leu Leu Ser Leu Leu Gly Arg Ile
1 5 10 15

Met Arg Tyr Phe Leu Leu Arg Pro Glu Thr Leu Phe Leu Leu Cys
20 25 30

Ile Ser Leu Ala Leu Trp Ser Tyr Phe Phe His Thr Asp Glu Val
35 40 45

Lys Thr Ile Val Lys Ser Ser Arg Asp Ala Val Lys Met Val Lys
50 55 60

Ser Lys Val Ala Glu Thr Met Gln Asn Asp Arg Leu Gly Gly Leu
65 70 75

Asp Val Leu Glu Ala Glu Phe Ser Lys Thr Trp Glu Phe Lys Asn
80 85 90

His Asn Val Ala Val Tyr Ser Ile Gln Gly Arg Arg Asp His Met
95 100 105

Glu Asp Arg Phe Glu Val Leu Thr Asp Leu Ala Asn Lys Thr His
110 115 120

Pro Ser Ile Phe Gly Ile Phe Asp Gly His Gly Gly Glu Thr Ala
125 130 135

Ala Glu Tyr Val Lys Ser Arg Leu Pro Glu Ala Leu Lys Gln His
140 145 150

Leu Gln Asp Tyr Glu Lys Asp Lys Glu Asn Ser Val Leu Ser Tyr
155 160 165

Gln Thr Ile Leu Glu Gln Gln Ile Leu Ser Ile Asp Arg Glu Met
170 175 180

Leu Glu Lys Leu Thr Val Ser Tyr Asp Glu Ala Gly Thr Thr Cys
185 190 195

Leu Ile Ala Leu Leu Ser Asp Lys Asp Leu Thr Val Ala Asn Val
200 205 210

Gly Asp Ser Arg Gly Val Leu Cys Asp Lys Asp Gly Asn Ala Ile
215 220 225

Pro Leu Ser His Asp His Lys Pro Tyr Gln Leu Lys Glu Arg Lys
230 235 240

Arg Ile Lys Arg Ala Gly Gly Phe Ile Ser Phe Asn Gly Ser Trp
245 250 255

Arg Val Gln Gly Ile Leu Ala Met Ser Arg Ser Leu Gly Asp Tyr
260 265 270

Pro Leu Lys Asn Leu Asn Val Val Ile Pro Asp Pro Asp Ile Leu
275 280 285

Thr Phe Asp Leu Asp Lys Leu Gln Pro Glu Phe Met Ile Leu Ala
290 295 300

Ser Asp Gly Leu Trp Asp Ala Phe Ser Asn Glu Glu Ala Val Arg
305 310 315

Phe Ile Lys Glu Arg Leu Asp Glu Pro His Phe Gly Ala Lys Ser
320 325 330

Ile Val Leu Gln Ser Phe Tyr Arg Gly Cys Pro Asp Asn Ile Thr
335 340 345

Val Met Val Val Lys Phe Arg Asn Ser Ser Lys Thr Glu Glu Gln
350 355 360



<210> 3
<211> 355
<212> PRT
<213> Homo sapiens

<220>
<221> misc_feature
<223> Incyte ID No: 7708162CD1

<400> 3

Met Asp Phe Leu His Arg Asn Gly Val Leu Ile Ile Gln His Leu
1 5 10 15

Gln Lys Asp Tyr Arg Ala Tyr Tyr Thr Phe Leu Asn Phe Met Ser
20 25 30

Asn Val Gly Asp Pro Arg Asn Ile Phe Phe Ile Tyr Phe Pro Leu
35 40 45

Cys Phe Gln Phe Asn Gln Thr Val Gly Thr Lys Met Ile Trp Val
50 55 60

Ala Val Ile Gly Asp Trp Leu Asn Leu Ile Phe Lys Trp Ile Leu
65 70 75

Phe Gly His Arg Pro Tyr Trp Trp Val Gln Glu Thr Gln Ile Tyr
80 85 90

Pro Asn His Ser Ser Pro Cys Leu Glu Gln Phe Pro Thr Thr Cys
95 100 105

Glu Thr Gly Pro Gly Ser Pro Ser Gly His Ala Met Gly Ala Ser
110 115 120

Cys Val Trp Tyr Val Met Val Thr Ala Ala Leu Ser His Thr Val
125 130 135

Cys Gly Met Asp Lys Phe Ser Ile Thr Leu His Arg Leu Thr Trp
140 145 150

Ser Phe Leu Trp Ser Val Phe Trp Leu Ile Gln Ile Ser Val Cys
155 160 165

Ile Ser Arg Val Phe Ile Ala Thr His Phe Pro His Gln Val Ile
170 175 180

Leu Gly Val Ile Gly Gly Met Leu Val Ala Glu Ala Phe Glu His
185 190 195

Thr Pro Gly Ile Gln Thr Ala Ser Leu Gly Thr Tyr Leu Lys Thr
200 205 210

Asn Leu Phe Leu Phe Leu Phe Ala Val Gly Phe Tyr Leu Leu Leu
215 220 225

Arg Val Leu Asn Ile Asp Leu Leu Trp Ser Val Pro Ile Ala Lys
230 235 240

Lys Trp Cys Ala Asn Pro Asp Trp Ile His Ile Asp Thr Thr Pro
245 250 255

Phe Ala Gly Leu Val Arg Asn Leu Gly Val Leu Phe Gly Leu Gly
260 265 270

Phe Ala Ile Asn Ser Glu Met Phe Leu Leu Ser Cys Arg Gly Gly
275 280 285

Asn Asn Tyr Thr Leu Ser Phe Arg Leu Leu Cys Ala Leu Thr Ser
290 295 300

Leu Thr Ile Leu Gln Leu Tyr His Phe Leu Gln Ile Pro Thr His
305 310 315

Glu Glu His Leu Phe Tyr Val Leu Ser Phe Cys Lys Ser Ala Ser
320 325 330

Ile Pro Leu Thr Val Val Ala Phe Ile Pro Tyr Ser Val His Met
335 340 345

Leu Met Lys Gln Ser Gly Lys Lys Ser Gln
350 355

<210> 4
<211> 493
<212> PRT
<213> Homo sapiens

<220>
<221> misc_feature
<223> Incyte ID No: 7473603CD1

<400> 4

Met Leu Glu Ser Ala Glu Gln Leu Leu Val Glu Asp Leu Tyr Asn
1 5 10 15

Arg Val Arg Glu Lys Met Asp Asp Thr Ser Leu Tyr Asn Thr Pro
20 25 30

Cys Val Leu Asp Leu Gln Arg Ala Leu Val Gln Asp Arg Gln Glu
35 40 45

Ala Pro Trp Asn Glu Val Asp Glu Val Trp Pro Asn Val Phe Ile
50 55 60

Ala Asp Arg Ser Val Ala Val Asn Lys Gly Arg Leu Lys Arg Leu
65 70 75

Gly Ile Thr His Ile Leu Asn Ala Ala His Gly Thr Gly Val Tyr
80 85 90

Thr Gly Pro Glu Phe Tyr Thr Gly Leu Glu Ile Gln Tyr Leu Gly
95 100 105

Val Glu Val Asp Asp Phe Pro Glu Val Asp Ile Ser Gln His Phe
110 115 120

Arg Lys Ala Tyr Cys His Tyr Ile Ile Phe Ser Cys Val Phe Ile
125 130 135

Ser Gly Lys Val Leu Val Ser Ser Glu Met Gly Ile Ser Arg Ser
140 145 150

Ala Val Leu Val Val Ala Tyr Leu Met Ile Phe His Asn Met Ala
155 160 165

Ile Leu Glu Ala Leu Met Thr Val Arg Lys Lys Arg Ala Ile Tyr
170 175 180

Pro Asn Glu Gly Phe Leu Lys Gln Leu Arg Glu Leu Asn Glu Lys
185 190 195

Leu Met Glu Glu Arg Glu Glu Asp Tyr Gly Arg Glu Gly Gly Ser
200 205 210

Ala Glu Ala Glu Glu Gly Glu Gly Thr Gly Ser Met Leu Gly Ala
215 220 225

Arg Val His Ala Leu Thr Val Glu Glu Glu Asp Asp Ser Ala Ser
230 235 240

His Leu Ser Gly Ser Ser Leu Gly Lys Ala Thr Gln Ala Ser Lys
245 250 255

Pro Leu Thr Leu Ile Asp Glu Glu Glu Glu Glu Lys Leu Tyr Glu
260 265 270

Gln Trp Lys Lys Gly Gln Gly Leu Leu Ser Asp Lys Val Pro Gln
275 280 285

Asp Gly Gly Gly Trp Arg Ser Ala Ser Ser Gly Gln Gly Gly Glu
290 295 300

Glu Leu Glu Asp Glu Asp Val Glu Arg Ile Ile Gln Glu Trp Gln
305 310 315

Ser Arg Asn Glu Arg Tyr Gln Ala Glu Gly Tyr Arg Arg Trp Gly
320 325 330

Arg Glu Glu Glu Lys Glu Glu Glu Ser Asp Ala Gly Ser Ser Val
335 340 345

Gly Arg Arg Arg Arg Thr Leu Ser Glu Ser Ser Ala Trp Glu Ser
350 355 360

Val Ser Ser His Asp Ile Trp Val Leu Lys Gln Gln Leu Glu Leu
365 370 375

Asn Arg Pro Asp His Gly Arg Arg Arg Arg Ala Asp Ser Met Ser
380 385 390

Ser Glu Ser Thr Trp Asp Ala Trp Asn Glu Arg Leu Leu Glu Ile
395 400 405

Glu Lys Glu Ala Ser Arg Arg Tyr His Ala Lys Ser Lys Arg Glu
410 415 420

Glu Ala Ala Asp Arg Ser Ser Glu Ala Gly Ser Arg Val Arg Glu
425 430 435

Asp Asp Glu Asp Ser Val Gly Ser Glu Ala Ser Ser Phe Tyr Asn
440 445 450

Phe Cys Ser Arg Asn Lys Asp Lys Leu Thr Ala Trp Lys Asp Gly
455 460 465

Arg Ser Arg Glu Ser Asn Leu Asp Phe Thr Arg Lys Thr Trp Glu
470 475 480

Arg Glu Thr Ala Ala Val Ser Pro Val Gln Arg Arg Gln
485 490

<210> 5

<211> 321
<212> PRT
<213> Homo sapiens

<220>
<221> misc_feature
<223> Incyte ID No: 7476687CD1

<400> 5

Met Pro Leu Leu Pro Ala Ala Leu Thr Ser Ser Met Leu Tyr Phe
1 5 10 15

Gln Met Val Ile Met Ala Gly Thr Val Met Leu Ala Tyr Tyr Phe
20 25 30

Glu Tyr Thr Asp Thr Phe Thr Val Asn Val Gln Gly Phe Phe Cys
35 40 45

His Asp Ser Ala Tyr Arg Lys Pro Tyr Pro Gly Pro Glu Asp Ser
50 55 60

Ser Ala Val Pro Pro Val Leu Leu Tyr Ser Leu Ala Ala Gly Val
65 70 75

Pro Val Leu Val Ile Ile Val Gly Glu Thr Ala Val Phe Cys Leu
80 85 90

Gln Leu Ala Thr Arg Asp Phe Glu Asn Gln Glu Lys Thr Ile Leu
95 100 105

Thr Gly Asp Cys Cys Tyr Ile Asn Pro Leu Val Arg Arg Thr Val
110 115 120

Arg Phe Leu Gly Ile Tyr Thr Phe Gly Leu Phe Ala Thr Asp Ile
125 130 135

Phe Val Asn Ala Gly Gln Val Val Thr Gly Asn Leu Ala Pro His
140 145 150

Phe Leu Ala Leu Cys Lys Pro Asn Tyr Thr Ala Leu Gly Cys Gln
155 160 165

Gln Tyr Thr Gln Phe Ile Ser Gly Glu Glu Ala Cys Thr Gly Asn
170 175 180

Pro Asp Leu Ile Met Arg Ala Arg Lys Thr Phe Pro Ser Lys Glu
185 190 195

Ala Ala Leu Ser Val Tyr Ala Ala Met Tyr Leu Thr Met Tyr Ile
200 205 210

Thr Asn Thr Ile Lys Ala Lys Gly Thr Arg Leu Ala Lys Pro Val
215 220 225

Leu Cys Leu Gly Leu Met Cys Leu Ala Phe Leu Thr Gly Leu Asn
230 235 240

Arg Val Ala Glu Tyr Arg Asn His Trp Ser Asp Val Ile Ala Gly
245 250 255

Phe Leu Val Gly Ile Ser Ile Ala Val Phe Leu Val Val Cys Val
260 265 270

Val Asn Asn Phe Lys Gly Arg Gln Ala Glu Asn Glu His Ile His
275 280 285

Met Asp Asn Leu Ala Gln Met Pro Met Ile Ser Ile Pro Arg Val
290 295 300

Glu Ser Pro Leu Glu Lys Val Thr Ser Val Gln Asn His Ile Thr
305 310 315

Ala Phe Ala Glu Val Thr
320



<210> 6
<211> 426
<212> PRT
<213> Homo sapiens

<220>
<221> misc_feature
<223> Incyte ID No: 7480440CD1

<400> 6

Met Ala Gly Leu Gly Phe Trp Gly His Pro Ala Gly Pro Leu Leu
1 5 10 15

Leu Leu Leu Leu Leu Val Leu Pro Pro Arg Ala Leu Pro Glu Gly
20 25 30

Pro Leu Val Phe Val Ala Leu Val Phe Arg His Gly Asp Arg Ala
35 40 45

Pro Leu Ala Ser Tyr Pro Met Asp Pro His Lys Glu Val Ala Ser
50 55 60

Thr Leu Trp Pro Arg Gly Leu Gly Gln Leu Thr Thr Glu Gly Val
65 70 75

Arg Gln Gln Leu Glu Leu Gly Arg Phe Leu Arg Ser Arg Tyr Glu
80 85 90

Ala Phe Leu Ser Pro Glu Tyr Arg Arg Glu Glu Val Tyr Ile Arg
95 100 105

Ser Thr Asp Phe Asp Arg Thr Leu Glu Ser Ala Gln Ala Asn Leu
110 115 120

Ala Gly Leu Phe Pro Glu Ala Ala Pro Gly Ser Pro Glu Ala Arg
125 130 135

Trp Arg Pro Ile Pro Val His Thr Val Pro Val Ala Glu Asp Lys
140 145 150

Leu Leu Arg Phe Pro Met Arg Ser Cys Pro Arg Tyr His Glu Leu
155 160 165

Leu Arg Glu Ala Thr Glu Ala Ala Glu ??? Gln Glu Ala Leu Glu
170 175 180

Gly Trp Thr Gly Phe Leu Ser Arg Leu Glu Asn Phe Thr Gly Leu
185 190 195

Ser Leu Val Gly Glu Pro Leu Arg Arg Ala Trp Lys Val Leu Asp
200 205 210

Thr Leu Met Cys Gln Gln Ala His Gly Leu Pro Leu Pro Ala Trp
215 220 225

Ala Ser Pro Asp Val Leu Arg Thr Leu Ala Gln Ile Ser Ala Leu
230 235 240

Asp Ile Gly Ala His Val Gly Pro Pro Arg Ala Ala Glu Lys Ala
245 250 255

Gln Leu Thr Gly Gly Ile Leu Leu Asn Ala Ile Leu Ala Asn Phe
260 265 270

Ser Arg Val Gln Arg Leu Gly Leu Pro Leu Lys Met Val Met Tyr
275 280 285

Ser Ala His Asp Ser Thr Leu Leu Ala Leu Gln Gly Ala Leu Gly
290 295 300

Leu Tyr Asp Gly His Thr Pro Pro Tyr Ala Ala Cys Leu Gly Phe
305 310 315

Glu Phe Arg Lys His Leu Gly Asn Pro Ala Lys Asp Gly Gly Asn
320 325 330

Val Thr Val Ser Leu Phe Tyr Arg Asn Asp Ser Ala His Leu Pro
335 340 345

Leu Pro Leu Ser Leu Pro Gly Cys Pro Ala Pro Cys Pro Leu Gly
350 355 360

Arg Phe Tyr Gln Leu Thr Ala Pro Ala Arg Pro Pro Ala His Gly
365 370 375

Val Ser Cys His Gly Pro Tyr Glu Ala Ala Ile Pro Pro Ala Pro
380 385 390

Val Val Pro Leu Leu Ala Gly Ala Val Ala Val Leu Val Ala Leu
395 400 405

Ser Leu Gly Leu Gly Leu Leu Ala Trp Arg Pro Gly Cys Leu Arg
410 415 420

Ala Leu Gly Gly Pro Val
425

<210> 7
<211> 665
<212> PRT
<213> Homo sapiens

<220>
<221> misc_feature
<223> Incyte ID No: 7480570CD1

<400> 7

Met Ala His Glu Met Ile Gly Thr Gln Ile Val Thr Glu Arg Leu
1 5 10 15

Val Ala Leu Leu Glu Ser Gly Thr Glu Lys Val Leu Leu Ile Asp
20 25 30

Ser Arg Pro Phe Val Glu Tyr Asn Thr Ser His Ile Leu Glu Ala
35 40 45

Ile Asn Ile Asn Cys Ser Lys Leu Met Lys Arg Arg Leu Gln Gln
50 55 60

Asp Lys Val Leu Ile Thr Glu Leu Ile Gln His Ser Ala Lys His
65 70 75

Lys Val Asp Ile Asp Cys Ser Gln Lys Val Val Val Tyr Asp Gln
80 85 90

Ser Ser Gln Asp Val Ala Ser Leu Ser Ser Asp Cys Phe Leu Thr
95 100 105

Val Leu Leu Gly Lys Leu Glu Lys Ser Phe Asn Ser Val His Leu
110 115 120

Leu Ala Gly Gly Phe Ala Glu Phe Ser Arg Cys Phe Pro Gly Leu
125 130 135

Cys Glu Gly Lys Ser Thr Leu Val Pro Thr Cys Ile Ser Gln Pro
140 145 150

Cys Leu Pro Val Ala Asn Ile Gly Pro Thr Arg Ile Leu Pro Asn
155 160 165

Leu Tyr Leu Gly Cys Gln Arg Asp Val Leu Asn Lys Glu Leu Met
170 175 180

Gln Gln Asn Gly Ile Gly Tyr Val Leu Asn Ala Ser Asn Thr Cys
185 190 195

Pro Lys Pro Asp Phe Ile Pro Glu Ser His Phe Leu Arg Val Pro
200 205 210

Val Asn Asp Ser Phe Cys Glu Lys Ile Leu Pro Trp Leu Asp Lys
215 220 225

Ser Val Asp Phe Ile Glu Lys Ala Lys Ala Ser Asn Gly Cys Val
230 235 240

Leu Val His Cys Leu Ala Gly Ile Ser Arg Ser Ala Thr Ile Ala
245 250 255

Ile Ala Tyr Ile Met Lys Arg Met Asp Met Ser Leu Asp Glu Ala
260 265 270

Tyr Arg Phe Val Lys Glu Lys Arg Pro Thr Ile Ser Pro Asn Phe
275 280 285

Asn Phe Leu Gly Gln Leu Leu Asp Tyr Glu Lys Lys Ile Lys Asn
290 295 300

Gln Thr Gly Ala Ser Gly Pro Lys Ser Lys Leu Lys Leu Leu His
305 310 315

Leu Glu Lys Pro Asn Glu Pro Val Pro Ala Val Ser Glu Gly Gly
320 325 330

Gln Lys Ser Glu Thr Pro Leu Ser Pro Pro Cys Ala Asp Ser Ala
335 340 345

Thr Ser Glu Ala Ala Gly Gln Arg Pro Val His Pro Ala Ser Val
350 355 360

Pro Ser Val Pro Ser Val Gln Pro Ser Leu Leu Glu Asp Ser Pro
365 370 375

Leu Val Gln Ala Leu Ser Gly Leu His Leu Ser Ala Asp Arg Leu
380 385 390

Glu Asp Ser Asn Lys Leu Lys Arg Ser Phe Ser Leu Asp Ile Lys
395 400 405

Ser Val Ser Tyr Ser Ala Ser Met Ala Ala Ser Leu His Gly Phe
410 415 420

Ser Ser Ser Glu Asp Ala Leu Glu Tyr Tyr Lys Pro Ser Thr Thr
425 430 435

Leu Asp Gly Thr Asn Lys Leu Cys Gln Phe Ser Pro Val Gln Glu
440 445 450

Leu Ser Glu Gln Thr Pro Glu Thr Ser Pro Asp Lys Glu Glu Ala
455 460 465

Ser Ile Pro Lys Lys Leu Gln Thr Ala Arg Pro Ser Asp Ser Gln
470 475 480

Ser Lys Arg Leu His Ser Val Arg Thr Ser Ser Ser Gly Thr Ala
485 490 495

Gln Arg Ser Leu Leu Ser Pro Leu His Arg Ser Gly Ser Val Glu
500 505 510

Asp Asn Tyr His Thr Ser Phe Leu Phe Gly Leu Ser Thr Ser Gln
515 520 525

Gln His Leu Thr Lys Ser Ala Gly Leu Gly Leu Lys Gly Trp His
530 535 540

Ser Asp Ile Leu Ala Pro Gln Thr Ser Thr Pro Ser Leu Thr Ser
545 550 555

Ser Trp Tyr Phe Ala Thr Glu Ser Ser His Phe Tyr Ser Ala Ser
560 565 570

Ala Ile Tyr Gly Gly Ser Ala Ser Tyr Ser Ala Tyr Ser Cys Ser
575 580 585

Gln Leu Pro Thr Cys Gly Asp Gln Val Tyr Ser Val Arg Arg Arg
590 595 600

Gln Lys Pro Ser Asp Arg Ala Asp Ser Arg Arg Ser Trp His Glu
605 610 615

Glu Ser Pro Phe Glu Lys Gln Phe Lys Arg Arg Ser Cys Gln Met
620 625 630

Glu Phe Gly Glu Ser Ile Met Ser Glu Asn Arg Ser Arg Glu Glu
635 640 645

Leu Gly Lys Val Gly Ser Gln Ser Ser Phe Ser Gly Ser Met Glu
650 655 660

Ile Ile Glu Val Ser
665

<210> 8
<211> 254
<212> PRT
<213> Homo sapiens

<220>
<221> misc_feature
<223> Incyte ID No: 4555838CD1

<400> 8

Met Ala Ala Val Ala Ala Thr Ala Ala Ala Lys Gly Asn Gly Gly
1 5 10 15

Gly Gly Gly Arg Ala Gly Ala Gly Asp Ala Ser Gly Thr Arg Lys
20 25 30

Lys Lys Gly Pro Gly Pro Pro Ala Thr Ala Tyr Leu Val Ile Tyr
35 40 45

Asn Val Val Met Thr Ala Gly Trp Leu Val Ile Ala Val Gly Leu
50 55 60

Val Arg Ala Tyr Leu Ala Lys Gly Ser Tyr His Ser Leu Tyr Tyr
65 70 75

Ser Ile Glu Lys Pro Leu Lys Phe Phe Gln Thr Gly Ala Leu Leu
80 85 90

Glu Ile Leu His Cys Ala Ile Gly Ile Val Pro Ser Ser Val Val
95 100 105

Leu Thr Ser Phe Gln Val Met Ser Arg Val Phe Leu Ile Trp Ala
110 115 120

Val Thr His Ser Val Lys Glu Val Gln Ser Glu Asp Ser Val Leu
125 130 135

Leu Phe Val Ile Ala Trp ??? ??? ??? ??? Ile Ile ??? ??? ???
140 145 150

Phe Tyr Thr Phe Ser Leu Leu Asn ??? Leu Pro Tyr Leu Ile Lys
155 160 165

Trp Ala Arg Tyr Thr Leu Phe Ile Val Leu Tyr Pro ??? ??? Val
170 175 180

Ser Gly Glu Leu Leu Thr Ile Tyr Ala Ala Leu Pro Phe Val Arg
185 190 195

Gln Ala Gly Leu Tyr Ser Ile Ser Leu Pro Asn Lys Tyr Asn Phe
200 205 210

Ser Phe Asp Tyr Tyr Ala Phe Leu Ile Leu Ile Met Ile Ser Tyr
215 220 225

Ile Pro Ile Phe Pro Gln Leu Tyr Phe His Met Ile His Gln Arg
230 235 240

Arg Lys Ile Leu Ser His Thr Glu Glu His Lys Lys Phe Glu
245 250

<210> 9
<211> 267
<212> PRT
<213> Homo sapiens

<220>
<221> misc_feature
<223> Incyte ID No: 636866CD1

<400> 9

Met Ser Gly Cys Phe Pro Val Ser Gly Leu Arg Cys Leu Ser Arg
1 5 10 15

Asp Gly Arg Met Ala Ala Gln Gly Ala Pro Arg Phe Leu Leu Thr
20 25 30

Phe Asp Phe Asp Glu Thr Ile Val Asp Glu Asn Ser Asp Asp Ser
35 40 45

Ile Val Arg Ala Ala Pro Gly Gln Arg Leu Pro Glu Ser Leu Arg
50 55 60

Ala Thr Tyr Arg Glu Gly Phe Tyr Asn Glu Tyr Met Gln Arg Val
65 70 75

Phe Lys Tyr Leu Gly Glu Gln Gly Val Arg Pro Arg Asp Leu Ser
80 85 90

Ala Ile Tyr Glu Ala Ile Pro Leu Ser Pro Gly Met Ser Asp Leu
95 100 105

Leu Gln Phe Val Ala Lys Gln Gly Ala Cys Phe Glu Val Ile Leu
110 115 120

Ile Ser Asp Ala Asn Thr Phe Gly Val Glu Ser Ser Leu Arg Ala
125 130 135

Ala Gly His His Ser Leu Phe Arg Arg Ile Leu Ser Asn Pro Ser
140 145 150

Gly Pro Asp Ala Arg Gly Leu Leu Ala Leu Arg Pro Phe His Thr
155 160 165

His Ser Cys Ala Arg Cys Pro Ala Asn Met Cys Lys His Lys Val
170 175 180

Leu Ser Asp Tyr Leu Arg Glu Arg Ala His Asp Gly Val His Phe
185 190 195

Glu Arg Leu Phe Tyr Val Gly Asp Gly Ala Asn Asp Phe Cys Pro
200 205 210

Met Gly Leu Leu Ala Gly Gly Asp Val Ala Phe Pro Arg Arg Gly
215 220 225

Tyr Pro Met His Arg Leu Ile Gln Glu Ala Gln Lys Ala Glu Pro
230 235 240

Ser Ser Phe Arg Ala Ser Val Val Pro Trp Glu Thr Ala Ala Asp
245 250 255

Val Arg Leu His Leu Gln Gln Val Leu Lys Ser Cys
260 265

<210> 10

<211> 329
<212> PRT
<213> Homo sapiens

<220>
<221> misc_feature
<223> Incyte ID No: 7475576CD1

<400> 10

Met Gln Gly Gln Thr Val Val Pro Lys Asp Ser Tyr Thr Ile Ser
1 5 10 15

Leu Ile Gln Arg Leu Arg Gly Arg Glu Ala Ala Arg Arg Thr His
20 25 30

Glu Asn Leu Leu Arg Leu Ser Ala Leu Val Arg Ser Pro Gln Thr
35 40 45

Ala Ser Ile Asp Cys His Thr Trp Ser Val Ser Ser Gly Thr Asn
50 55 60

Thr Ser Leu Gln Ala Ser Gly Leu Gly Arg Gln Gly Ser Cys Asp
65 70 75

Arg Ile Ala Ser Arg Ala Ala Ser Trp Gly Cys Thr Arg Thr Ala
80 85 90

Ala Pro Gly Ile Met Gly Asn Gly Met Thr Lys Val Leu Pro Gly
95 100 105

Leu Tyr Leu Gly Asn Phe Ile Asp Ala Lys Asp Leu Asp Gln Leu
110 115 120

Gly Arg Asn Lys Ile Thr His Ile Ile Ser Ile His Glu Ser Pro
125 130 135

Gln Pro Leu Leu Gln Asp Ile Thr Tyr Leu Arg Ile Pro Val Ala
140 145 150

Asp Thr Pro Glu Val Pro Ile Lys Lys His Phe Lys Glu Cys Ile
155 160 165

Asn Phe Ile His Cys Cys Arg Leu Asn Gly Gly Asn Cys Leu Val
170 175 180

His Cys Phe Ala Gly Ile Ser Arg Ser Thr Thr Ile Val Thr Ala
185 190 195

Tyr Val Met Thr Val Thr Gly Leu Gly Trp Arg Asp Val Leu Glu
200 205 210

Ala Ile Lys Ala Thr Arg Pro Ile Ala Asn Pro Asn Pro Gly Phe
215 220 225

Arg Gln Gln Leu Glu Glu Phe Gly Trp Ala Ser Ser Gln Lys Leu
230 235 240

Arg Arg Gln Leu Glu Glu Arg Phe Gly Glu Ser Pro Phe Arg Asp
245 250 255

Glu Glu Glu Leu Arg Ala Leu Leu Pro Leu Cys Lys Arg Cys Arg
260 265 270

Gln Gly Ser Ala Thr Ser Ala Ser Ser Ala Gly Pro His Ser Ala
275 280 285

Ala Ser Glu Gly Thr Leu Gln Arg Leu Val Pro Arg Thr Pro Arg
290 295 300

Glu Ala His Arg Pro Leu Pro Leu Leu Ala Arg Val Lys Gln Thr
305 310 315

Phe Ser Cys Leu Pro Arg Cys Leu Ser Arg Lys Gly Gly Lys
320 325



<210> 11
<211> 1845
<212> DNA
<213> Homo sapiens

<220>
<221> misc_feature
<223> Incyte ID No: 1905692CB1

<400> 11

cacaaatgaa ttatcaggag tgaacccaga ggcacgtatg aatgaaagtc ctgatccgac 60 
tgacctggcg ggagtcatca ttgagctcgg ccccaatgac agtccacaga caagtgaatt 120 
taaaggagca accgaggagg cacctgcgaa agaaagccca cacacaagtg aatttaaagg 180 
agcagcccgg gtgtcaccta tcagtgaaag tgtgttagca cgactttcca agtttgaaga 240 
tgaagatgct gaaaatgttg cttcatatga cagcaagatt aagaaaattg tgcattcaat 300 
tgtatcatcc tttgcatttg gactatttgg agttttcctg gtcttactgg atgtcactct 360 
catccttgcc gacctaattt tcactgacag caaactttat attcctttgg agtatcgttc 420 
tatttctcta gctattgcct tattttttct catggatgtt cttcttcgag tatttgtaga 480 
aaggagacag cagtattttt ctgacttatt taacatttta gatactgcca ttattgtgat 540 
tcttctgctg gttgatgtcg tttacatttt ttttgacatt aagttgctta ggaatattcc 600 
cagatggaca catttacttc gacttctacg acttattatt ctgttaagaa tttttcatct 660 
gtttcatcaa aaaagacaac ttgaaaagct gataagaagg cgggtttcag aaaacaaaag 720 
gcgatacaca agggatggat ttgacctaga cctcacttac gttacagaac gtattattgc 780 
tatgtcattt ccatcttctg gaaggcagtc tttctataga aatccaatca aggttattcc 840 
ctatagagat atgacataca tattatttat tttaggtgaa agagcttacg atcctaagca 900 
cttccataat agggtcgtta gaatcatgat tgatgatcat aatgtcccca ctctacatca 960 
gatggtggtt ttcaccaagg aagtaaatga gtggatggct caagatcttg aaaacatcgt 1020 
agcgattcac tgtaaaggag gcacagatag aacaggaact atggtttgtg ccttccttat 1080 
tgcctctgaa atatgttcaa ctgcaaagga aagcctgtat tattttggag aaaggcgaac 1140 
agataaaacc cacagcgaaa aatttcaggg agtaaaaact ccttctcaga agagatatgt 1200 
tgcatatttt gcacaagtga aacatctcta caactggaat ctccctccaa gacggatact 1260 
ctttataaaa cacttcatta tttattcgat tcctcgttat gtacgtgatc taaaaatcca 1320 
aatagaaatg gagaaaaagg ttgtcttttc cactatttca ttaggaaaat gttcggtact 1380 
tgataacatt acaacagaca aaatattaat tgatgtattc gacggtccac ctctgtatga 1440 
tgatgtgaaa gtgcagtttt tctcttcgaa tcttcctaca tactatgaca attgctcatt 1500 
ttacttctgg ttgcacacat cttttattga aaataacagg ctttatctac caaaaaatga 1560 
attggataat ctacataaac aaaaagcacg gagaatttat ccatcagatt ttgccgtgga 1620 
gatacttttt ggcgagaaaa tgacttccag tgatgttgta gctggatccg attaagtata 1680 
gctccccctt ccccttctgg gaaagaatta tgttctttcc aaccctgcca catgttcata 1740 
tatcctaaat ctatcctaaa tgttccttga agtatttatt tatgtttata tatgtttata 1800 
tatgttcttc ataaatctat tacatatata tagataaaaa aaaaa 1845 

<210> 12 

<211> 2451 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7476908CB1 

<400> 12 

ccggacccgg cgagccttcg gggcgcgcgt cgctggtggt ggttgaggct ctagcgataa 60 
taaatgatag aggatacaat gactttgctg tctctgctgg gtcgcatcat gcgctacttc 120 
ttgctgagac ccgagacgct tttcctgctg tgcatcagct tggctctatg gagttacttc 180 
ttccacaccg acgaggtgaa gaccatcgtg aagtccagcc gggacgccgt gaagatggtg 240 
aagagcaagg tagccgagac catgcagaac gatcgactcg gggggcttga tgtgctcgag 300 
gccgagtttt ccaagacctg ggagttcaag aaccacaacg tggcggtgta ctccatccag 360 
ggccggagag accacatgga ggaccgcttc gaagttctca cggatctggc caacaagacg 420 
cacccgtcca tcttcgggat cttcgacggg cacgggggag agactgcagc tgaatatgta 480 
aaatctcgac tcccagaggc tcttaaacag catcttcagg actacgagaa agacaaagaa 540 
aatagtgtat tatcttacca gaccatcctt gaacagcaga ttttgtcaat tgaccgagaa 600 
atgctagaaa aattgactgt atcctatgat gaagcaggca caacgtgttt gattgctctg 660 
ctatcagata aagacctcac tgtggccaac gtgggtgact cgcgcggggt cctgtgtgac 720 
aaagatggga acgctattcc tttgtctcat gatcacaagc cttaccagtt gaaggaaaga 780 
aagaggataa agagagcagg tggtttcatc agtttcaatg gctcctggag ggtccaggga 840 
atcctggcca tgtctcggtc cctgggggat tatccgctga aaaatctcaa cgtggtcatc 900 
ccagacccag acatcctgac ctttgacctg gacaagcttc agcctgagtt catgatcttg 960 
gcatcagatg gtctctggga tgctttcagc aatgaagaag cagttcgatt catcaaggag 1020 
cgcttggatg aacctcactt tggggccaag agcatagttt tacagtcatt ttacagaggc 1080 
tgccctgaca atataacagt catggtggtg aagttcagaa atagcagcaa aacagaagag 1140 
cagtgaaccc ttcaggggtc tcagctgcct tagactaaag gactttcaac acactggtct 1200
cttttaattt agtgaaaagt gtgggagttg taattaggat catccacccc agacatggaa 1260
tcccccctcc ctggtggtct taggtctata atcagtgacg aacagagggt gcccttggcc 1320
aatgtagtta agaaactgga aaatggtttc ttcatgtttt cccaactctt tcatccagtg 1380
tccaaaatat ataagtaaat agctgtagag tcacatatat gaagtgaata gcatatgtgt 1440
catttagtct ccctgaagat tcttttcaag atcctgttca gggtcctcca ggcatcagct 1500
gttgtgtcct ctctttgtaa cagtggacag gacagaccac ccagtgctgc aggagacagg 1560
ccactgcgtc acctgtgagt ggtcaggggc tgatgtggca acaccctctg ccaagagaca 1620
gagctgtcct gagaatgctt tgtccttctg agcccatgtt ttctgctcag tagcagcttg 1680
gaagcagatt tggaatggtt tattattttg gctgctcttg gggactgcga gaagcagaga 1740
gaatgagaga ccagtggcaa ctgcctgcac agcagagata accctcttcc cttgcttcct 1800
ttaatagtta aatagacttt gtataccacc tgaccagcct ttgtgcattt atcctaatca 1860
tgcatgaccg ttaacctttt gcttagtcct taccatatgt aataggcagc tgttaaattc 1920
accaacagat accctgattt ttcatcttac gtgaccaaga aaccacgtta ggggaaatga 1980
aaaaagcaag ccacaatacc atgattcctt ccattttcaa cagtagatga aggaaatgat 2040
actgaatgag tcacagtgtt ccctggcaag taagctgttt gcattgagaa aggagtgagc 2100
tggtgaggtt accaccctga attgagctcc agctgccagt ttttgtgttt ttccttgccc 2160
ctttccaagt ggttttcaag tgtcaggcag tgttctgaga agcagcagcc tataactgta 2220
tgtgtgttcc ttgaagccag gtgcagagtt cccagctact gcagcttggg atttggtggg 2280
aaactactgg gataagcttc tccttgacaa tggaaaggca gcagtcttca acatttggtt 2340
gcaaatctcc atccacatca gggagctttc cccaggcaaa tacaaaccgc cccgtggcct 2400
gcaggcctgc aggggaggca gcaaagggac ctggcagttg ceiacacagta a 2451

<210> 13

<211> 1105

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7708162CB1

<400> 13

agtgtgctgg aaaggtttag aatgcctctt tttcaagatg gatttccttc acaggaatgg 60
agtgctcata attcagcatt tgcagaagga ctaccgagct tactacactt ttctaaattt 120
tatgtccaat gttggagacc ccaggaatat ctttttcatt tattttccac tttgttttca 180
atttaatcag acagttggaa ccaagatgat atgggtagca gtcattgggg attggttaaa 240
tcttatattt aaatggatat tatttggtca tcgaccttac tggtgggtcc aagaaactca 300
gatttaccca aatcactcaa gtccatgcct tgaacagttc cctactacat gtgaaacagg 360
tccaggaagt ccatctggcc atgcaatggg cgcatcctgt gtctggtatg tcatggtaac 420
cgctgccctg agccacactg tctgtgggat ggataagttc tctatcactc tgcacagact 480
gacctggtca tttctttgga gtgttttttg gttgattcaa atcagtgtct gcatctccag 540
agtattcata gcaacacatt ttcctcatca agttattctt ggagtaattg gtggcatgct 600
ggtggcagag gcctttgaac acactccagg catccaaacg gccagtctgg gcacatacct 660
gaagaccaac ctctttctct tcctgtttgc agttggcttt tacctgcttc ttagggtgct 720
caacattgac ctgctgtggt ccgtgcccat agccaaaaag tggtgtgcta accccgactg 780
gatccacatt gacaccacgc cttttgctgg actcgtgaga aaccttgggg tcctctttgg 840
cttgggcttt gcaatcaact cagagatgtt cctcctgagc tgccgagggg gaaataacta 900
cacactgagc ttccggttgc tctgtgcctt gacctcattg acaatactgc agctctacca 960
tttcctccag atcccgactc acgaagagca tttattttat gtgctgtctt tttgtaaaag 1020
tgcatccatt cccctaactg tggttgcttt cattccctac tctgttcata tgttaatgaa 1080
acaaagcgga aagaagagtc agtag 1105



<210> 14 

<211> 1730 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7473603CB1 



<400> 14 

atgctggagt ctgctgaaca gctgctggtg gaggacctgt acaaccgcgt cagggagaag 60
atggatgaca ccagcctcta taatacgccc tgtgtcctgg acctacagcg ggccctggtt 120
caggatcgcc aagaggcgcc ctggaatgag gtggatgagg tctggcccaa tgtcttcata 180
gctgacagga gtgtggctgt gaacaagggg aggctgaaga ggctgggaat cacccacatt 240
ctgaatgctg cgcatggcac cggcgtttac actggccccg aattctacac tggcctggag 300
atccagtacc tgggtgtaga ggtggatgac tttcctgagg tggacatttc ccagcatttc 360
cggaaggcgt actgtcatta catcattttc tcttgtgttt tcatttcagg gaaagtcctg 420
gtcagcagcg aaatgggcat cagccggtca gcagtgctgg tggtcgccta cctgatgatc 480
ttccacaaca tggccatcct ggaggctttg atgaccgtgc gtaagaagcg ggccatctac 540
cccaatgagg gcttcctgaa gcagctgcgg gagctcaatg agaagttgat ggaggagaga 600
gaagaggact atggccggga ggggggatca gctgaggctg aggagggcga gggcactggg 660
agcatgctcg gggccagagt gcacgccctg acggtggaag aggaggacga cagcgccagc 720
cacctgagtg gctcctccct ggggaaggcc acccaggcct ccaagcccct caccctcata 780
gacgaggagg aggaggagaa actgtacgag cagtggaaga aggggcaggg cctcctctca 840
gacaaggtcc cccaggatgg aggtggctgg cgctcagcct cctctggcca gggtggggag 900
gagctcgagg acgaggacgt ggagaggatc atccaggagt ggcagagccg aaacgagagg 960
taccaagcag aagggtaccg gaggtgggga agggaggagg agaaggagga ggagagcgac 1020
gctggctcct cggtggggag gcggcggcgc accctgagcg agagcagcgc ctgggagagc 1080
gtgagcagcc acgacatctg ggtcctgaag cagcagctgg agctgaaccg cccggaccac 1140
ggcaggaggc gccgcgcaga ctcgatgtcc tcggagagca cctgggacgc atggaacgag 1200
aggctgctgg agattgagaa ggaggcttcc cggaggtacc acgccaagag caagagagag 1260
gaggcggcag acaggagctc agaagcaggg agcagggtgc gggaggatga tgaggacagc 1320
gtgggctctg aggccagttc cttctacaac ttctgcagca ggaacaagga caagctcact 1380
gcctggaaag atggaagatc aagagaatcc aatttggatt tcacaagaaa gacttgggag 1440
cgggagacag cagcggtgag cccggtgcag aggaggcagt aggggagaag aacccctccg 1500
acgtcagcct gacagcctac caggccctgg aagctgaaac accagaagaa ggtggggcag 1560
tgagaaccag gaggaggtgg tggagctcag caggggggag gacttggcct tggctaagaa 1620
gagacgacgg aggctggagc tgctggagag aagccggaga acctggagga gagccagtct 1680
attgcagctg ggaggcggac agtccagcgc ggggagattc cctgttggtt 1730

<210> 15

<211> 2145

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7476687CB1

<400> 15

tcctctgcag cctgccggcg cctcgaagcc cggacctgcc tccgcctctt cctccagcgg 60
ctccatcccg cctcccgtgc ctcgtcctcc cgccgcctcc gccgccgcgc ccccggtggc 120
tgcctcggcg gaccggggag ggggcccacc gcgtcggccg cccgctcggc tcggctcggc 180
ccggcccggg aggcgtgcat gcccctgctg cccgcggcgc tcaccagcag catgctctat 240
ttccagatgg tgatcatggc agggacggtg atgctggcgt actacttcga gtatacggac 300
acgttcaccg tgaacgtgca gggcttcttc tgccacgaca gcgcctaccg caaaccctac 360
ccgggcccgg aggacagcag cgccgtgccc cccgtgctcc tctactcgct ggccgccggg 420
gtccccgtgc tcgtgataat agttggagaa actgctgtct tttgcctaca actagccaca 480
agggattttg aaaaccagga aaaaactatt ttaactggag actgttgcta tataaacccg 540
ctggtgcgcc gaactgtccg atttcttgga atttatacat ttggactgtt tgctacagat 600
atctttgtaa atgctggaca agtagtcaca ggaaatctgg ccccacattt ccttgccctg 660
tgtaagccca attatacagc acttggatgt cagcagtata cacaattcat cagtggggaa 720
gaggcctgta ctggcaaccc agatctcatc atgagagccc gaaaaacctt tccatccaaa 780
gaagcagctc tcagtgtcta tgcagctatg tatctgacca tgtacatcac caacacaatc 840
aaagccaagg gaaccagact tgctaagcca gttctatgct tgggcttaat gtgtttggca 900
tttcttactg gactcaacag agtagcagaa tatcgaaatc attggtcaga tgttatagca 960
ggctttctgg ttggaatatc tatagcagta tttctggttg tgtgcgtggt gaataatttc 1020
aaagggagac aagcagaaaa tgagcatata cacatggata atctggcaca gatgccaatg 1080
atcagcattc ctcgagtaga aagtcctttg gaaaaggtaa catctgtaca gaaccacatc 1140
actgccttcg cagaagtcac atgatatcga agcagatggt ttttcactgc attggacatc 1200
atcccttttt accatccatt cataacaccc aaagtttgtt tgattgcaag tgaagtttat 1260
aaagttgttt aataattttt ataattttaa aatcaatgtc aacctatctg tttcccccac 1320
tagactgtga gctcctcgag ggcaggattg tctttttttt tctgtgtccc cagcacttac 1380
aacaaagcct agcacaaagt aggtgttcaa caaaaatgtg atgaccaaat gaaaaaaatt 1440
cttaaataaa acattcactt tagtttctca cagaatcatt gcaattatgt taaaagaaat 1500
ctctacataa atcgtatttg tgtatgaaaa ccttctattt tgggctagtt atttttttaa 1560
tcttcatata tctattcagc agtatgccat atttaatttg aagtggactt tgaaagtcat 1620
gggggttttt attttgttat tcagcatgac attatttcca ttcgtaacat ttcagtgtgt 1680
gaaattactt tatttttaga aagtatgttc tatagtaaaa taatgtttcc acattatatt 1740
atgttatatt tcacttaaaa tactattcat actatacatt ctaagactgg tgcttctgct 1800
tttgaagggg aaaatgccaa tttttactgt aataagtaat gtatcataat taaaaattat 1860
ttatttggac ttcttctcct gacaattgtg gcttaattca tgacttgttt ttgaatgcag 1920
gcagtattta gggtagttta aatgagtaaa ttcagcactg gtaccttatt attgagtaat 1980
ttcccatggg tagcagtgtc tcaagagtgg tcaaaagctc cactcttagg cttttttact 2040
actaaagatt ccacataatt taaatgggaa agaactatac cctgacacat aatttaaatt 2100
atataaatgc tagaaatatg gttaatgtat tttactttgc atttg 2145

<210> 16

<211> 1352

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7480440CB1

<400> 16

atggccggcc tggggttttg gggccaccct gctggacctc tcctgctgct gctgctgctg 60
gtgctgccac cccgggccct gccagaagga cccctggtgt tcgtggctct ggtattccgc 120
catggcgacc gggccccgct ggcctcctac cccatggacc cacacaagga ggtggcctcc 180
accctgtggc cacgaggcct gggccagctg accacggagg gggtccgcca gcagctggag 240
ctgggccgct tcctgaggag ccgctacgac gccttcctga gtccggagta ccggcgggag 300
gaggtgtaca tccgcagcac ggactttgac cgcacgctgg agagtgccca ggccaacctt 360
gccgggctgt ttcccgaggc tgctccaggg agccccgagg cccgctggag gccgatcccg 420
gtgcacacgg tgcccgtggc tgaggataag ctgctgaggt tccccatgcg cagctgtccc 480
cgataccacg agctgctgcg ggaggccacc gaggccgccg agtaccagga ggccctggag 540
ggctggacgg gcttcctgag tcgcctggag aacttcacgg gactgtcgct ggttggagag 600
ccactgcgca gggcatggaa ggttctggac accctcatgt gccagcaagc ccacggtctt 660
ccactaccag cctgggcctc cccagatgtc ctgcggactc ttgcccagat ctcggctttg 720
gatattggag cccacgtggg cccaccccgg gcagcagaga aggcccagct gacagggggg 780
atcctgctga atgctatcct tgcaaacttc tcccgggtcc agcgcctggg gctgcccctc 840
aagatggtca tgtactcagc tcatgacagc accctgctgg ccctccaggg ggccctgggc 900
ctctatgatg gacacacccc gccatatgct gcctgcctcg gctttgagtt ccggaagcac 960
ctggggaatc ccgccaaaga tggagggaat gtcaccgtct ccctcttcta ccgcaatgac 1020
tccgcccacc tgcccctgcc tctcagcctc cccgggtgcc cggccccctg tccactaggc 1080
cgcttctacc agctgactgc cccggcccgg cctcccgccc atggggtctc ctgccatggc 1140
ccctatgagg ctgccatccc cccagctcca gtggtgcccc tgctggccgg agctgtagct 1200
gtgctggtgg cactcagctt ggggctgggc ctgctggcct ggagaccagg gtgcctgcgg 1260
gccttggggg gccccgtgtg agccagaaac cagggcttcc ctacccccag ctgacactgg 1320
accccaacat gtatgctcag tagctgcaaa aa 1352

<210> 17

<211> 3766

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7480570CB1

<400> 17

gaaaagaaga tgaggaggag agcgacggga cgggacgcga gcgggagcgc agccgccctc 60
tcggctccgc ggcggcgcct cgcaagtccg ggaggcgagg ggggcccgag gggagacgcc 120
gtgacaactt tcgtttccct ctgagggaat tgggaggtcg gcggccccaa aagctttcag 180
tccagtgtaa agctgttgga gcgcgggagc aaaggtaaag aatgatgtaa tgcgctggct 240
gctccaaagc atcttttgtt gtggaatggt tattccagtc atctctttat gaatcaaatg 300
tgaggggctg ctttgtggac ggagtccttt gcaagagcac atcaacggga aagagaaaga 360
gacattcact tggagggctc ttgctgaaaa tgggtttaac tctccttttg ccagtcacca 420
ccagcctgac ctcatacact tttagtacaa tggagtggct gagcctttga gcacaccacc 480 
attacatpat cgtggcaaat taaagaagga ggtgggaaaa gaggacttat tgttgtcatg 540 
gcccatgaga tgattggaac tcaaattgtt actgagaggt tggtggctct gctggaaagt 600 
ggaacggaaa aagtgctgct aattgatagc cggccatttg tggaatacaa tacatcccac 660 
attttggaag ccattaatat caactgctcc aagcttatga agcgaaggtt gcaacaggac 720 
aaagtgttaa ttacagagct catccagcat tcagcgaaac ataaggttga cattgattgc 780 
agtcagaagg ttgtagttta cgatcaaagc tcccaagatg ttgcctctct ctcttcagac 840 
tgttttctca ctgtacttct gggtaaactg gagaagagct tcaactctgt tcacctgctt 900 
gcaggtgggt ttgctgagtt ctctcgttgt ttccctggcc tctgtgaagg aaaatccact 960 
ctagtcccta cctgcatttc tcagccttgc ttacctgttg ccaacattgg gccaacccga 1020 
attcttccca atctttatct tggctgccag cgagatgtcc tcaacaagga gctgatgcag 1080 
cagaatggga ttggttatgt gttaaatgcc agcaatacct gtccaaagcc tgactttatc 1140 
cccgagtctc atttcctgcg tgtgcctgtg aatgacagct tttgtgagaa aattttgccg 1200 
tggttggaca aatcagtaga tttcattgag aaagcaaaag cctccaatgg atgtgttcta 1260 
gtgcactgtt tagctgggat ctcccgctcc gccaccatcg ctatcgccta catcatgaag 1320 
aggatggaca tgtctttaga tgaagcttac agatttgtga aagaaaaaag acctactata 1380 
tctccaaact tcaattttct gggccaactc ctggactatg agaagaagat taagaaccag 1440
actggagcat cagggccaaa gagcaaactc aagctgctgc acctggagaa gccaaatgaa 1500 
cctgtccctg ctgtctcaga gggtggacag aaaagcgaga cgcccctcag tccaccctgt 1560 
gccgactctg ctacctcaga ggcagcagga caaaggcccg tgcatcccgc cagcgtgccc 1620 
agcgtgccca gcgtgcagcc gtcgctgtta gaggacagcc cgctggtaca ggcgctcagt 1680 
gggctgcacc tgtccgcaga caggctggaa gacagcaata agctcaagcg ttccttctct 1740 
ctggatatca aatcagtttc atattcagcc agcatggcag catccttaca tggcttctcc 1800 
tcatcagaag atgctttgga atactacaaa ccttccacta ctctggatgg gaccaacaag 1860 
ctatgccagt tctcccctgt tcaggaacta tcggagcaga ctcccgaaac cagtcctgat 1920 
aaggaggaag ccagcatccc caagaagctg cagaccgcca ggccttcaga cagccagagc 1980 
aagcgattgc attcggtcag aaccagcagc agtggcaccg cccagaggtc ccttttatct 2040 
ccactgcatc gaagtgggag cgtggaggac aattaccaca ccagcttcct tttcggcctt 2100 
tccaccagcc agcagcacct cacgaagtct gctggcctgg gccttaaggg ctggcactcg 2160 
gatatcttgg ccccccagac ctctacccct tccctgacca gcagctggta ttttgccaca 2220 
gagtcctcac acttctactc tgcctcagcc atctacggag gcagtgccag ttactctgcc 2280 
tacagctgca gccagctgcc cacttgcgga gaccaagtct attctgtgcg caggcggcag 2340 
aagccaagtg acagagctga ctcgcggcgg agctggcatg aagagagccc ctttgaaaag 2400 
cagtttaaac gcagaagctg ccaaatggaa tttggagaga gcatcatgtc agagaacagg 2460 
tcacgggaag agctggggaa agtgggcagt cagtctagct tttcgggcag catggaaatc 2520 
attgaggtct cctgagaaga aagacacttg tgacttctat agacaatttt tttttcttgt 2580 
tcacaaaaaa attccctgta aatctgaaat atatatatgt acatacatat atatttttgg 2640 
aaaatggagc tatggtgtaa aagcaacagg tggatcaacc cagttgttac tctcttaaca 2700 
tctgcatttg agagatcagc taatacttct ctcaacaaaa atggaagggc agatgctagg 2760 
atccccccta gacggaggaa aaccatttta ttcagtgaat tacacatcct cttgttctta 2820 
aaaaagcaag tgtctttggt gttggaggac aaaatcccct accattttca cgttgtgcta 2880 
ctaagagatc tcaaatatta gtctttgtcc ggacccttcc atagtacacc ttagcgctga 2940 
gactgagcca gcttgggggt caggtaggta gaccctgtta gggacagagc ctagtggtaa 3000 
atccaagaga aatgatccta tccaaagctg attcacaaac ccacgctcac ctgacagccg 3060 
agggacacga gcatcactct gctggacgga ccattagggg ccttgccaag gtctacctta 3120 
gagcaaaccc agtacctcag acaggaaagt cggggctttg accactacca tatctggtag 3180 
cccattttct aggcattgtg aataggtagg tagctagtca cacttttcag accaattcaa 3240 
actgtctatg cacaaaattc ccgtgggcct agatggagat aatttttttt tcttctcagc 3300 
tttatgaaga gaagggaaac tgtctaggat tcagctgaac caccaggaac ctggcaacat 3360 
cacgatttaa gctaaggttg ggaggctaac gagtctacct ccctctttgt aaatcaaaga 3420 
attgtttaaa atgggattgt caatccttta aataaagatg aacttggttt caagccaaat 3480 
gtgaatttat ttgggttggt agcagagcag cagcaccttc aaattctcag ccaaagcaga 3540 
tgtttttgcc ctttctgctt cactgcatgg atacagttgg taaaatgtaa taatatggca 3600 
gaattttata ggaaacttcc tagggaggta aattatggga agattaagaa aggtacaaat 3660 
tgctgaggag aagcaggaaa cctgtttcct tagtggcttt tatcccctcg gcatgcgatg 3720 
gggctgatgt ttctatgatt gcctcagact ttcacattta ctagta 3766 

<210> 18 

<211> 2656 

<212> DNA 

<213> Homo sapiens 
<220>

<221> misc_feature

<223> Incyte ID No: 4555838CB1

<220>

<221> unsure

<222> 2532

<223> a, c, g, or other

<400> 18

gccgtggacc ctgtcccggc tccgccccgc ggccctggcc ccgcctgccc ctcgtccctt 60
cctccgcccc ctctcctcgc gtagtccggc ccgagccgct cgcgctagga gagcgggctt 120
cgggcacttg acatggcggc agtggcggcg actgcagcag cgaaggggaa tgggggcggc 180
ggtgggaggg ccggggccgg ggacgccagc ggcacgcgga agaagaaggg cccggggccc 240
ccggccacgg cgtacctggt catctacaat gtggtgatga cagccgggtg gctggttata 300
gcggttggtc tggtccgagc atacctggct aagggtagct accatagcct ttattattca 360
attgaaaagc ctttgaaatt ctttcaaact ggagccttat tggagatttt acattgtgct 420
ataggaattg ttccatcttc tgttgtcctg acttctttcc aggtgatgtc aagagttttt 480
ctaatatggg cagtaacaca tagcgtcaaa gaggtacaga gtgaagacag tgtcctcctg 540
tttgttattg catggacgat cacggaaatc atccgttact ccttttatac attcagtcta 600
ttaaaccatc tgccttacct catcaaatgg gccaggtaca cacttttcat tgtgctgtac 660
ccaatgggag tgtcaggaga actgctcaca atatatgcag ctctgccctt tgtcagacaa 720
gctggcctat attccatcag tttacccaac aaatacaatt tctcttttga ctactatgca 780
ttcctgattc taataatgat ctcctacatt ccaatttttc cccagttata cttccacatg 840
atacaccaga gaagaaagat cctttctcat actgaagaac acaagaaatt tgaatagttc 900
ctgctttctg cacctcccac caaaacaaac ttttcaatga tcaaaaaatg ctgcagattt 960
tttgagttcc caatacgttt catagaaaat aagtaagaac tatttttaaa atattcaaac 1020
aaaactaaaa caaaaatcca gtgtcacatg ggcctgagat tttattttag aaaaaggttg 1080
ttacataaaa caccctggcc agttcatttc agcatgctct ttcaaccaga agttcttaat 1140
atttatgatg gcactagaaa gggatttggc attttatgtc cttctgtgtc cttcatgtat 1200
ctgatcaatg aagacctgta acactaagta cttgagagtt acagtctgaa taatgaagtc 1260
gtaccagctg aatagcccag cttgcag1:at agttatgttt cagtctgcag tgtgtttagc 1320
attcccttgt caaagtgctt gactgcatgc tggaaacttt gtatttttga agcagcaaac 1380
tctgttctct ggaatgctct gaagttatgg ctgggaccta tcccctcaca tctaatgaat 1440
gaattataaa atgtatatgt ctatgaagct tcggggtagt gcctgtaatc agaaaacaac 1500
ttagaaccct tttgtttgtt tccaattgag tcattactgc ctgccactaa gaaacgtgct 1560
tgaatctaat aagtatgtgt gtaccgtaaa gaatatatct tatctggagc tcagcctcaa 1620
tcatgtctta acaaaatgac aggtctcaga aagggggagc tcaatagctc aaaagtgaca 1680
agtccttttc acagcaccgt tctcagaaca cctctgagta acgtgtttgc cagtagctat 1740
tctcactgat gcactgatgg ccctgaagaa gcggatccag tcacatagga aaggaggctg 1800
tgttagtgaa agcacatgga aggtgttgct ttagaaaggt agtcaggaaa aacattcagg 1860
aiatagattta tacaccatta ttgttttatt tttaaatttt cattcactct tctgtttgga 1920
tacttttgct aattaacgtc ctatgttaat ttccaccaag ctataagtcc atagtcagta 1980
aaacattccc cttgggctgt catgagctaa aagcagtgtc atctccgcat gttggagcag 2040
ccaagaaata gtttggtact accgacatcg tctaatccat gtcacatcct catacaattt 2100
aattgctcaa ccatgcattt aaaactcctc aagaaaggat tggtactgca actgtaggta 2160
aactgaaaaa aaataagaeia gaaagagttg gatgaaaatg tgaaagccca agtttagatg 2220
tgcattaagt attaaatagc acagtatctt cttcatggag ccttttttcc tcccccatcc 2280
cctgcagctg cctttttttg ggggcagggt gggggttgat gttgaacttt aagagtttaa 2340
aagtttagct tattgagtag ttgtcattta aaatataatt gcgaatatca gaaaactcat 2400
actggaaaac taaatttttt tttttctctt gagacggagt ctcgctctgt tgcccaggct 2460
ggagtgcagt ggcgcgatct cggctcactg caagctccac ctcccgggtt cacgccatcc 2520
tcctgactca gnctcctgag tagctgggac tacaggtgcc tggcaacaaa cccagctaat 2580
ttttttgttt ttaagaaaaa cggggttcac cgggttaccc agatggtctg atttctgacc 2640
cttgacaccc gctaag 2656

<210> 19

<211> 1292

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 636866CB1

<400> 19

cgctgctgca gcagccgcag cgccggccgc ggctccggct ccggctccgg ctcccgggca 60
tttaaagggg acgcggcggc tgcccggggg ggatgagggg caagtggagg ggacggctca 120
gacgcacatc atcctcagtc cctcgggact ggagggactc gtgagccgga gcccagaaat 180
ccgggggtgg ataagacacc gcgtcccctc caattcccgt aagcacccct tgctccatcc 240
tgcgccccaa tacctcagct agcccccttc cccacttctt acactccaaa ctcagccggg 300
acagacctct gctgccgccg cccccacgaa cgtgtgacga cggctggagg ccaacagagt 360
ccctacaggt ggtgctcacg gtaatgcacc gacaatgagt ggctgttttc cagtttctgg 420
cctccgctgc ctatctaggg acggcaggat ggccgcgcag ggcgcgccgc gcttcctcct 480
gaccttcgac ttcgacgaga ctatcgtgga cgaaaacagc gacgattcga tcgtgcgcgc 540
cgcgccgggc cagcggctcc cggagagcct gcgagccacc taccgcgagg gcttctacaa 600
cgagtacatg cagcgcgtct tcaagtacct gggcgagcag ggcgtgcggc cgcgggacct 660
gagcgccatc tacgaagcca tccctttgtc gccaggcatg agcgacctgc tgcagtttgt 720
ggcaaaacag ggcgcctgct tcgaggtgat tctcatctcc gatgccaaca cctttggcgt 780
ggagagctcg ctgcgcgccg ccggccacca cagcctgttc cgccgcatcc tcagcaaccc 840
gtcggggccg gatgcgcggg gactgctggc tctgcggccg ttccacacac acagctgcgc 900
gcgctgcccc gccaacatgt gcaagcacaa ggtgctcagc gactacctgc gcgagcgggc 960
ccacgacggc gtgcacttcg agcgcctctt ctacgtgggc gacggcgcca acgacttctg 1020
ccccatgggg ctgctggcgg gcggcgacgt ggccttcccg cgccgcggct accccatgca 1080
ccgcctcatt caggaggccc agaaggccga gcccagctcg ttccgcgcca gcgtggtgcc 1140
ctgggaaacg gctgcagatg tgcgcctcca cctgcaacag gtgctgaagt cgtgctgagt 1200
ctggccgcct gcaggggggt acccgggcca acggcggagg gggcggggaa gggagattcg 1260
gcaaagacag ctttactact cccttaaaaa aa 1292

<210> 20

<211> 1325

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7475576CB1

<400> 20

atgcaggggc agactgtagt tccaaaagat tcctacacta tatcccttat ccagaggctg 60
cggggccgtg aggccgcaag gagaacccat gagaaccttc ttcggctgtc tgccctagtg 120
agatccccac agacagctag catcgactgc cacacgtggt cagtttctag tggaaccaat 180
acttcgctgc aggcgtcggg cctgggccgt cagggcagct gtgaccggat cgcttcccgg 240
gcggcgagct gggggtgcac ccggaccgcc gcccccggga tcatgggcaa tggcatgacc 300
aaggtacttc ctggactcta cctcggaaac ttcattgatg ccaaagacct ggatcagctg 360
ggccgaaata agatcacaca catcatctct atccatgagt caccccagcc tctgctgcag 420
gatatcacct accttcgcat cccggtcgct gatacccctg aggtacccat caaaaagcac 480
ttcaaagaat gtatcaactt catccactgc tgccgcctta atggggggaa ctgccttgtg 540
cactgctttg caggcatctc tcgcagcacc acgattgtga cagcgtatgt gatgactgtg 600
acggggctag gctggcggga cgtgcttgaa gccatcaagg ccaccaggcc catcgccaac 660
cccaacccag gctttaggca gcagcttgaa gagtttggct gggccagttc ccagaagctt 720
cgccggcagc tggaggagcg cttcggcgag agccccttcc gcgacgagga ggagttgcgc 780
gcgctgctgc cgctgtgcaa gcgctgccgg cagggctccg cgacctcggc ctcctccgcc 840
gggccgcact cagcagcctc cgagggaacc ctgcagcgcc tggtgccgcg cacgccccgg 900
gaagcccacc ggccgctgcc gctgctggcg cgcgtcaagc agactttctc ttgcctcccc 960
cggtgtctgt cccgcaaggg cggcaagtga ggatgcagtc cagccgtggc tccccacttc 1020
cgactggctc ccttcggggg ctgtctgcgc cttccacgcc ccccaggacg ggcccagagg 1080
ctgggggagc cccgcggcgg cctgaaccct gcctcccgcg cccgccctgc tcgtccgcgt 1140
ctgcagtcag cgtccccaac ctgtgcgtct ctgtgtccgg gccggcctgc tgcagccacc 1200
tggtgcctta gtccttgggc tgggggaggg ggcccaccct taaaggcggc gggaggggag 1260
ggagggagag tggagggttt gacgggcctg gagggtatta aagagacaca gaagaaaaaa 1320
aaaaa 1325